kamiyaa / ruiji

Reverse anime image searching program
GNU Lesser General Public License v3.0

Parser falls over on unexpected HTML #6

Open Fuuzetsu opened 7 years ago

Fuuzetsu commented 7 years ago

parse_xy_img_dimensions segfaults unless the input is in exactly the expected format. There are two causes: the return value of strstr is never checked for NULL, and the return code of sscanf is never checked when parsing the numbers. I expect parse_percent_similar has exactly the same issue. The case may seem contrived, but the segfault will happen as soon as one of the domains changes the format of its output.
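
To illustrate the two checks described above, here is a minimal sketch. The helper names, the caller-supplied marker string, and the "WIDTHxHEIGHT" layout are all assumptions made for the example; this is not ruiji's actual code or the real iqdb markup.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not ruiji's real function): find `marker` in `html`
 * and parse "<width>x<height>" immediately after it.
 * Returns 0 on success, -1 if the marker is missing or the numbers do not
 * parse. The output parameters are left untouched on failure. */
int parse_dimensions_checked(const char *html, const char *marker,
                             unsigned int *width, unsigned int *height)
{
    if (html == NULL || marker == NULL || width == NULL || height == NULL)
        return -1;

    const char *pos = strstr(html, marker);
    if (pos == NULL)            /* marker absent: nothing to dereference */
        return -1;
    pos += strlen(marker);

    unsigned int w, h;
    /* sscanf returns the number of successful conversions; require both */
    if (sscanf(pos, "%ux%u", &w, &h) != 2)
        return -1;

    *width = w;
    *height = h;
    return 0;
}

/* Hypothetical helper: same pattern for a percentage such as "96%".
 * Values above 100 are rejected as malformed. */
int parse_percent_checked(const char *html, const char *marker,
                          unsigned int *percent)
{
    if (html == NULL || marker == NULL || percent == NULL)
        return -1;

    const char *pos = strstr(html, marker);
    if (pos == NULL)
        return -1;
    pos += strlen(marker);

    unsigned int p;
    if (sscanf(pos, "%u", &p) != 1 || p > 100)
        return -1;

    *percent = p;
    return 0;
}
```

With this shape the caller can skip or report a result whose fields fail to parse instead of crashing. One caveat: sscanf with %u does not detect overflow, so strtoul with an errno check would be the stricter choice if that matters.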

I attach some inputs generated by AFL. You can replicate by feeding these as html_data in ruiji.c. ruiji-crashes.tar.gz

kamiyaa commented 7 years ago

Are you still able to reproduce this with the latest commit, 85b0940e80c0011475ac14c009fbe5d3479c5f65? Note that I changed html_data to iqdb_html somewhere in between.

I wasn't able to reproduce the segfaults by replacing html_data with the inputs you provided, using 9891e48c9fb651ef0ab4fb86e8a679f5bbcb3703. This is the output I got:

./ruiji -T makise.jpg 
Uploading makise.jpg to https://iqdb.org...

[0]
source: https://danbooru.donmai.us/postg
similarity: 0%
dimensions: 100x0

[1]
source: https:images.sankakucomplex.com/gfx/favicon.png
similarity: 9%
dimensions: 100x0

[2]
source: http://e-net/special/favicon.ico
similarity: 96%
dimensions: 10x0

[3]
source: http://www.zerochan.net/2015451n.ico
similarity: 43%
dimensions: 3x0

Which one would you like to download? (-1 to exit): -1
Error: Invalid option selected

Fuuzetsu commented 7 years ago

I was unable to reproduce any more crashes in that part of the code. However, I found multiple issues in the tag parsers; I think I will create a separate issue for those.

Fuuzetsu commented 7 years ago

Sorry, I lied: here's an input that crashes on master. in2.txt

To replicate simply add


#include <stdio.h>
#include <stdlib.h>

/* Read the whole file into a NUL-terminated heap buffer. */
char *load_file(const char *file_name)
{
  FILE *pFile = fopen(file_name, "rb");
  if (pFile == NULL) { fputs("File error\n", stderr); exit(1); }

  // obtain file size:
  fseek(pFile, 0, SEEK_END);
  long lSize = ftell(pFile);
  rewind(pFile);
  if (lSize < 0) { fputs("Size error\n", stderr); exit(1); }

  // allocate memory for the whole file plus a terminating NUL,
  // since the parsers run string functions such as strstr on it:
  char *buffer = malloc((size_t)lSize + 1);
  if (buffer == NULL) { fputs("Memory error\n", stderr); exit(2); }

  // copy the file into the buffer:
  size_t result = fread(buffer, 1, (size_t)lSize, pFile);
  if (result != (size_t)lSize) { fputs("Reading error\n", stderr); exit(3); }
  buffer[lSize] = '\0';

  /* the whole file is now loaded in the memory buffer. */
  fclose(pFile);
  return buffer;
}

int main(int argc, char *argv[]) {
  if (argc < 2) { fprintf(stderr, "usage: %s <html-file>\n", argv[0]); return 1; }
  char *html_content = load_file(argv[1]);

  struct similar_image_llnode *image_list =
    create_image_list(html_content, 0);
  print_sim_results(image_list);
  free_similar_image_list(image_list);

  /*
  char stop_seq = '\0';
  for (int i = 1; i <= 8; i++) {
    unsigned int domain_uuid = i;
    char *dl_url = get_image_source_url(domain_uuid, html_content, &stop_seq);
    struct image_tag_db *tags_db = get_image_tags(domain_uuid, html_content);
    printf("Tags:\n");
    print_image_tags(tags_db);
    free_image_tags(tags_db);
    free(dl_url);
  }
*/
  free(html_content);
  return 0;
}

and feed the file in.