NAL-i5K / GFF3toolkit

Python programs for processing GFF3 files

AttributeError: 'NoneType' object has no attribute 'groups' #99

Open lijing28101 opened 4 years ago

lijing28101 commented 4 years ago

Hi,

I installed gff3toolkit with pip under Python 3.5. The command I ran is:

/work/LAS/mash-lab/jing/bin/Anaconda3/envs/mypy3.5/bin/gff3_merge -g1 transdecoder_final.gff3 -g2 Zm_B73.gff3 -f Zm-B73-REFERENCE-NAM-5.0.fa -og merged_final+anno.gff3 -r merged_fina_anno_report.txt

It fails at the step that identifies types of replacement based on the replace tag.

INFO     Extract sequences from transdecoder_final.gff3...
INFO            Extract CDS sequences...
INFO            Extract premature transcript sequences...
INFO     Extract sequences from Zm_B73.gff3...
INFO            Extract CDS sequences...
INFO            Extract premature transcript sequences...
INFO     Catenate transdecoder_final.gff3 and Zm_B73.gff3...
INFO     Make blastDB for CDS sequences from auto_replace_tag/tmp/gff2_cds.fa...
INFO     Sequence alignment for cds fasta files between transdecoder_final.gff3 and Zm_B73.gff3...
INFO     Find CDS matched pairs between transdecoder_final.gff3 and Zm_B73.gff3...
INFO     Make blastDB for premature transcript sequences from auto_replace_tag/tmp/gff2_pre_trans.fa...
INFO     Sequence alignment for premature transcript fasta files between transdecoder_final.gff3 and Zm_B73.gff3...
INFO     Find premature transcript matched pairs between transdecoder_final.gff3 and Zm_B73.gff3...
INFO     Generate auto_replace_tag/check1.txt for Check Point 1 internal reviewing...
INFO     Reading revision file... (auto_replace_tag/check1.txt)
INFO     Reading gff3 file... (transdecoder_final.gff3)
INFO     Writing summary report (auto_replace_tag/replace_tag_report.txt)...
INFO     Writing revised gff: (auto_replace_tag/Revised_transdecoder_final.gff3)...
INFO     ========== Check whether there are missing replace tags ==========
INFO     - All models have replace tags.
INFO     ========== Merge the two gff files ==========
INFO     Sorting the WA gff by following the order of Scaffold number and coordinates...
INFO     Sorting and printing out...
INFO     Sorting the other gff by following the order of Scaffold number and coordinates...
INFO     Sorting and printing out...
INFO     Reading WA gff3 file...
INFO     Reading the other gff3 file...
INFO     Identifying types of replacement based on replace tag...
Traceback (most recent call last):
  File "/work/LAS/mash-lab/jing/bin/Anaconda3/envs/mypy3.5/bin/gff3_merge", line 8, in <module>
    sys.exit(script_main())
  File "/work/LAS/mash-lab/jing/bin/Anaconda3/envs/mypy3.5/lib/python3.5/site-packages/gff3tool/bin/gff3_merge.py", line 229, in script_main
    main(args.gff_file1, args.gff_file2, args.fasta, report_fh, args.output_gff, args.all, args.auto_assignment, args.user_defined_file1, args.user_defined_file2, logger=logger_stderr)
  File "/work/LAS/mash-lab/jing/bin/Anaconda3/envs/mypy3.5/lib/python3.5/site-packages/gff3tool/bin/gff3_merge.py", line 85, in main
    gff3_merge.merge.main(autoReviseGff, gff_file2, output_gff, report, user_defined1, user_defined2, logger)
  File "/work/LAS/mash-lab/jing/bin/Anaconda3/envs/mypy3.5/lib/python3.5/site-packages/gff3tool/lib/gff3_merge/merge.py", line 34, in main
    ReplaceGroups = replace_OGS.Groups(WAgff=gff3, Pgff=gff3M, outsideNum=1, user_defined1=user_defined1, user_defined2=user_defined2, logger=logger_null)
  File "/work/LAS/mash-lab/jing/bin/Anaconda3/envs/mypy3.5/lib/python3.5/site-packages/gff3tool/lib/replace_OGS.py", line 253, in __init__
    self.name2id(Pgff, user_defined2)
  File "/work/LAS/mash-lab/jing/bin/Anaconda3/envs/mypy3.5/lib/python3.5/site-packages/gff3tool/lib/replace_OGS.py", line 483, in name2id
    idprefix = tmp.groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'

Please help me solve the problem.

Thanks, Jing

mpoelchau commented 4 years ago

Hi Jing,

Sorry you're running into issues. It looks like the program is having trouble with the ID attributes.

Can you send me the 2 gff3 files and reference fasta file so I can see what's going on (or links to them if they're large)? monica.poelchau@usda.gov

mpoelchau commented 4 years ago

Hi @lijing28101 - the problem you see is occurring because our merge program is making some (possibly unreasonable) assumptions about the way the ID attribute is formatted.
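To illustrate the failure mode: the traceback ends at `idprefix = tmp.groups()[0]`, which is what happens when a regular-expression match returns `None` and the result is used without a check. Below is a minimal sketch, not the actual code in replace_OGS.py. The pattern (a non-digit prefix followed only by digits) and the example IDs are assumptions chosen for illustration.

```python
import re

# Hypothetical pattern: a non-digit prefix followed by a run of digits.
# The real pattern in replace_OGS.py may differ.
ID_PATTERN = re.compile(r"^(\D+)(\d+)$")


def unsafe_prefix(feature_id):
    """Mimic the failing line: use the match result without checking it."""
    tmp = ID_PATTERN.match(feature_id)
    return tmp.groups()[0]  # AttributeError if tmp is None


def safe_prefix(feature_id):
    """Return the prefix, or None when the ID does not fit the pattern."""
    tmp = ID_PATTERN.match(feature_id)
    if tmp is None:
        return None
    return tmp.groups()[0]


# Made-up IDs: the first fits the assumed pattern, the second does not.
print(safe_prefix("tmp000001"))
print(safe_prefix("gene.1.p1"))
```

Under this assumed pattern, any ID with a suffix after the digits (such as `.p1`) makes the match return `None`, so the unchecked version raises the same `AttributeError` seen in the traceback.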

We'll work on removing those assumptions on our end. In the meantime, there is a workaround: you can use the ancillary script in this repo (lib/gff3_ID_generator.py) to reformat your IDs, then run the merge program again on the output gff3 file(s). Hopefully that will resolve your immediate problem.

I'd suggest running it as follows, using the -idpre and -diglen arguments:

python3 gff3_ID_generator.py -g input.gff3 -og output.gff3 -idpre tmp -diglen 6 -r id-map.txt
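As a rough picture of what to expect afterwards, here is a small sketch for spot-checking the IDs in the output file. It assumes, going only by the option names, that `-idpre tmp -diglen 6` yields IDs made of the prefix plus a zero-padded six-digit number; the script's actual output may differ. The helper names `make_id` and `extract_id` are hypothetical and not part of GFF3toolkit.

```python
def make_id(prefix, number, diglen):
    """Build an ID such as 'tmp000001' (assumed format, see note above)."""
    return "{p}{n:0{w}d}".format(p=prefix, n=number, w=diglen)


def extract_id(gff3_line):
    """Return the ID attribute of a GFF3 feature line, or None if absent."""
    if gff3_line.startswith("#"):
        return None  # comment or directive line
    columns = gff3_line.rstrip("\n").split("\t")
    if len(columns) != 9:
        return None  # not a well-formed feature line
    for attribute in columns[8].split(";"):
        key, _, value = attribute.partition("=")
        if key.strip() == "ID":
            return value
    return None


# Made-up feature line for illustration.
line = "chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=tmp000001;Name=example\n"
print(extract_id(line) == make_id("tmp", 1, 6))
```

The sketch uses `str.format` rather than f-strings so that it also runs on Python 3.5, the version mentioned in the report.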