faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

phyluce_assembly_match_contigs_to_probes - AttributeError: 'NoneType' object has no attribute 'groups' - NEW (Issue #337)

Closed jjinzhou closed 2 months ago

jjinzhou commented 2 months ago

Dear Professor Faircloth,

I'm having a problem running the command phyluce_assembly_match_contigs_to_probes, which seems to be the same problem as #233 and #321. Following your answers to those two issues, I modified ~/.phyluce.conf in my conda environment as follows:

[headers]
trinity:comp\d+_c\d+_seq\d+|c\d+_g\d+_i\d+|TR\d+|c\d+_g\d+_i\d+|TRINITY_DN\d+_c\d+_g\d+i\d+
velvet:node\d+
abyss:node\d+
idba:contig-\d+\d+
spades:NODE_\d+length\d+cov\d+.\d+
minia3:scaffold_\d+uid\d+

I made this change because I assembled with Minia3, and the assembly file starts with scaffold_41_uid_1710815508. But the results still show the same error. I then tried the --regex scaffold_\d+_uid_\d+ parameter, but the error persisted.

I don't know what went wrong, and I was hoping for your guidance. Here is the command I used and the specific error:

phyluce_assembly_match_contigs_to_probes \
    --contigs assemblies_endding \
    --probes hymenoptera-v2-PRINCIPAL-bait-set.fasta \
    --output uce_extraction \
    --min-coverage 67 \
    --min-identity 80

  File "~/miniconda3/envs/phyluce-1.7.3/bin/phyluce_assembly_match_contigs_to_probes", line 421, in <module>
    main()
  File "~/miniconda3/envs/phyluce-1.7.3/bin/phyluce_assembly_match_contigs_to_probes", line 354, in main
    contig_name = get_contig_name(lz.name1)
  File "~/miniconda3/envs/phyluce-1.7.3/bin/phyluce_assembly_match_contigs_to_probes", line 279, in get_contig_name
    return match.groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'

brantfaircloth commented 2 months ago

The header string you are using looks incorrect. It should be something like:

minia3:scaffold_\d+_uid_\d+

to match a header name like:

scaffold_41_uid_1710815508

The --regex option won't work - that's meant for something else.
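To illustrate the point about the header string, here is a minimal sketch in plain Python (not phyluce's own code) that tests both patterns against the example header. The helper name first_group and the wrapping of each pattern in "^(...)" are assumptions made for illustration; the only detail taken from the traceback is that the script calls match.groups()[0] on the result of a regex match, which fails when that result is None.

```python
import re

# Example header quoted in this thread.
header = "scaffold_41_uid_1710815508"

# Pattern as it appeared in the pasted config (underscores missing).
# Wrapping it in "^(...)" is an assumption made for this illustration.
pasted = re.compile(r"^(scaffold_\d+uid\d+)")

# Pattern suggested in the reply above.
suggested = re.compile(r"^(scaffold_\d+_uid_\d+)")


def first_group(pattern, text):
    """Return the first captured group, or None when the pattern does not match."""
    match = pattern.search(text)
    # re.search returns None on failure; calling .groups() on None raises
    # AttributeError: 'NoneType' object has no attribute 'groups'.
    return match.groups()[0] if match is not None else None


print(first_group(pasted, header))
print(first_group(suggested, header))
```

The first call prints None because the pasted pattern expects "uid" directly after the digits, while the second prints the full header name.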

jjinzhou commented 2 months ago

Thank you so much for your quick reply! Sorry, I think that was a formatting error caused by pasting. The format you describe is exactly what I'm using, but the error still occurs.

[Screenshot 2024-04-23 10 50 38: terminal output]

phyluce_assembly_match_contigs_to_probes \
    --contigs assemblies_endding \
    --probes hymenoptera-v2-PRINCIPAL-bait-set.fasta \
    --output uce_extraction \
    --min-coverage 67 \
    --min-identity 80
2024-04-23 10:33:46,824 - phyluce_assembly_match_contigs_to_probes - INFO ======= Starting phyluce_assembly_match_contigs_to_probes
2024-04-23 10:33:46,824 - phyluce_assembly_match_contigs_to_probes - INFO

brantfaircloth commented 2 months ago

I'm not sure - Minia is not a supported assembler, so what you are trying to do is non-standard. If I had to guess, it would be that some scaffolds/contigs in the output have a different header format than the one you think they have. But I don't use Minia, so I'm not sure what the header format looks like for all scaffolds/contigs.
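One way to test the guess above is to scan every header in an assembly and list the ones the pattern does not match. The sketch below is a hedged illustration in plain Python, not part of phyluce: the function name unmatched_headers is hypothetical, and the second header in the example data is made up to show what a mismatch would look like. It also assumes the name that reaches the regex is the FASTA header text, which may not hold for every pipeline step.

```python
import re


def unmatched_headers(fasta_lines, pattern):
    """Return the FASTA header names that do not start with a match for the pattern."""
    regex = re.compile(pattern)
    bad = []
    for line in fasta_lines:
        if line.startswith(">"):
            # Drop the leading ">" and surrounding whitespace.
            name = line[1:].strip()
            if regex.match(name) is None:
                bad.append(name)
    return bad


# In-memory example; the "contig_7" header is invented for illustration.
example = [
    ">scaffold_41_uid_1710815508",
    "ACGT",
    ">contig_7 hypothetical",
    "GGCC",
]

print(unmatched_headers(example, r"scaffold_\d+_uid_\d+"))
```

To check a real assembly, an open file handle can be passed in place of the list, since the function only iterates over lines. An empty result would suggest the headers themselves are not the cause.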

jjinzhou commented 2 months ago

Thank you very much for your help. I have re-checked the header format of each sample, and it is indeed the same format. But I don't have many samples, so I decided to reassemble them with SPAdes to avoid this problem. Have a good day!