jhawkey / IS_mapper

IS mapping software

Unable to compile one table for all results from compiled_table.py? #49

Closed safinaARK closed 2 years ago

safinaARK commented 2 years ago

Dear Hawkey,

I have run ismap on 52 samples, which created results in multiple folders, one for each IS sequence provided to the --query option. I would like to compile these into one table, but I am getting the following error:

`Traceback (most recent call last):
  File "/home/sar/miniconda3/bin/compiled_table.py", line 520, in <module>
    main()
  File "/home/sar/miniconda3/bin/compiled_table.py", line 499, in main
    gb = SeqIO.read(args.reference, "genbank")
  File "/home/sar/miniconda3/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 661, in read
    raise ValueError("More than one record found in handle")
ValueError: More than one record found in handle`

I ran the following command:

compiled_table.py --tables `find . -name '*1_table.txt'` --reference ../../../flex.gbk --query mnt/e/Working/Shigella/Data/IS.database.fa --out_prefix combined

where `find . -name '*1_table.txt'` returns all the tables for the 52 samples and 35 IS sequences; that is, each sample folder contains 35 IS sequence subfolders.

Please help.

Thank you

SAR

jhawkey commented 2 years ago

Hi,

I believe the error you're getting is because the GenBank file you're using contains more than one record. compiled_table.py can only take results tables from a single IS query and a single GenBank record. So if you've run ISMapper with multiple IS queries and a multi-entry GenBank file, you need to run the compile command once for each IS/reference combination (e.g. IS1 + chromosome, IS1 + plasmid). There are full details about how to run compiled_table.py and its outputs in the README.
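One way to get a single-record reference for each run is to split the multi-entry GenBank file into one file per record. The following is a minimal sketch using only the standard library; it assumes the usual flat-file layout in which every record starts with a `LOCUS` line and ends with a line containing only `//`. The function name `split_genbank` and the output naming are hypothetical and not part of ISMapper. Biopython's `SeqIO.parse` combined with `SeqIO.write` should achieve the same result.

```python
from pathlib import Path


def split_genbank(path, out_dir):
    """Write each record of a multi-record GenBank file to its own file.

    Assumption: each record starts with a LOCUS line and ends with a
    line containing only "//". Text after the last "//" is ignored.
    Returns the list of files written, in input order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    record = []
    with open(path) as handle:
        for line in handle:
            # Skip blank lines that sit between records
            if not record and not line.strip():
                continue
            record.append(line)
            if line.strip() == "//":
                # Use the first token after LOCUS as the file name,
                # falling back to a counter if no LOCUS name is found
                name = f"record{len(written) + 1}"
                for rec_line in record:
                    fields = rec_line.split()
                    if rec_line.startswith("LOCUS") and len(fields) > 1:
                        name = fields[1]
                        break
                out_path = out_dir / f"{name}.gbk"
                out_path.write_text("".join(record))
                written.append(out_path)
                record = []
    return written
```

Each file written this way holds exactly one record, so it can be passed to `--reference` together with the tables for one IS query.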

It's unfortunate that I wrote it this way. I believe that at the time this was the simplest way to implement compiling all the hits together, because the compilation script collapses overlapping hits together and also determines the genes near to each hit.
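To illustrate why collapsing overlapping hits ties the compilation to a single reference record, here is a generic interval-merging sketch. It is not the code in compiled_table.py, and the real script may use different rules; the function name `merge_overlapping` and the `(start, end)` tuple representation are assumptions made for illustration. Positions are only comparable when they refer to the same sequence, so hits from the chromosome and from a plasmid cannot be merged in one pass.

```python
def merge_overlapping(hits):
    """Merge (start, end) intervals that overlap.

    Assumption: coordinates are inclusive and all refer to the same
    reference record. Returns a sorted list of merged intervals.
    """
    merged = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it if needed
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

For example, hits at `(100, 150)` and `(140, 200)` from two different samples would be reported as one position spanning `(100, 200)`, while a hit at `(500, 520)` stays separate.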