faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

return match.groups()[0] AttributeError: 'NoneType' object has no attribute 'groups' #321

Closed 51mystic closed 7 months ago

51mystic commented 7 months ago

Dear teacher, I assembled the whole genomes of scale insects with megahit to obtain contigs, copied the contigs of all species into one folder, and tried to match them against the probes designed by my lab. The command is as follows:

phyluce_assembly_match_contigs_to_probes --contigs "/mnt/data/userdata/svip019/00----outcome/uce-o/uce-rowdata/" --probes /mnt/data/userdata/svip019/probes/jiezongke_tanzhen.fasta --output /mnt/data/userdata/svip019/00----outcome/uce-o/uce-matchdata

The configuration is as follows:

2023-11-06 21:24:06,119 - phyluce_assembly_match_contigs_to_probes - INFO - ======= Starting phyluce_assembly_match_contigs_to_probes =======
2023-11-06 21:24:06,120 - phyluce_assembly_match_contigs_to_probes - INFO - Version: 1.7.2
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Commit: None
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --contigs: /mnt/data/userdata/svip019/00----outcome/uce-o/uce-rowdata
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --csv: None
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --dupefile: None
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --keep_duplicates: None
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --log_path: None
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --min_coverage: 80
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --min_identity: 80
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --output: /mnt/data/userdata/svip019/00----outcome/uce-o/uce-matchdata
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --probes: /mnt/data/userdata/svip019/probes/jiezongke_tanzhen.fasta
2023-11-06 21:24:06,121 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --regex: ^(uce-\d+)(?:_p\d+.*)
2023-11-06 21:24:06,122 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --verbosity: INFO
2023-11-06 21:24:06,838 - phyluce_assembly_match_contigs_to_probes - INFO - Creating the UCE-match database
2023-11-06 21:24:06,911 - phyluce_assembly_match_contigs_to_probes - INFO - Processing contig data
2023-11-06 21:24:06,915 - phyluce_assembly_match_contigs_to_probes - INFO -
-----------------------------------------------------------------

The error is reported as follows:

Traceback (most recent call last):
  File "/mnt/data/userdata/svip019/anaconda3/envs/phyluce-1.7.2/bin/phyluce_assembly_match_contigs_to_probes", line 421, in <module>
    main()
  File "/mnt/data/userdata/svip019/anaconda3/envs/phyluce-1.7.2/bin/phyluce_assembly_match_contigs_to_probes", line 354, in main
    contig_name = get_contig_name(lz.name1)
  File "/mnt/data/userdata/svip019/anaconda3/envs/phyluce-1.7.2/bin/phyluce_assembly_match_contigs_to_probes", line 279, in get_contig_name
    return match.groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'

How can I solve this problem? Looking forward to your answer.

brantfaircloth commented 7 months ago

phyluce expects one of several contig naming schemes, e.g. those output by the assembly programs used in phyluce. Your contigs likely do not follow any of those naming schemes, which causes the problem. You can either edit the default config file and add a regular expression for the naming scheme used by the contigs you have assembled (https://github.com/faircloth-lab/phyluce/blob/main/config/phyluce.conf), or you can create a copy of this file at ~/.phyluce.conf and edit the file in that location. There are several existing regular expressions there that you could use as guidance (but what you are trying to do is not a standard part of the phyluce pipeline).
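The failure mode described above can be reproduced outside phyluce. The following is a minimal sketch, assuming (the thread does not show the source) that the contig name is pulled out of the header with a regular expression and that the result is used without checking for a non-match. The pattern, the helper names, and the contig headers are illustrative and are not taken from phyluce.

```python
import re

# Illustrative pattern and helpers; none of this is phyluce's own code.
pattern = re.compile(r"^(node_\d+)", re.IGNORECASE)


def contig_name_unchecked(header):
    # Same shape as the failing line in the traceback: no check for None.
    match = pattern.search(header)
    return match.groups()[0]


def contig_name_checked(header):
    # Hypothetical safer variant that names the offending header.
    match = pattern.search(header)
    if match is None:
        raise ValueError(f"no naming scheme matches contig header: {header!r}")
    return match.groups()[0]


print(contig_name_unchecked("NODE_12_length_500"))

try:
    contig_name_unchecked("contig_0001")  # header the pattern does not cover
except AttributeError as err:
    print(err)
```

When no pattern covers the header, `re` returns `None`, and calling `.groups()` on it raises the same `AttributeError` seen in the report.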

51mystic commented 7 months ago

Dear Teacher, I received your reply and looked at phyluce.conf, but it contains no entry for megahit. An example contig from my megahit assembly and the contents of the config file are shown below.

>k141_27374 flag=1 multi=2.0000 len=302
TCTACTCCGATACTATCTACAAATTCCGGACCTTTTCGATCTC

[binaries]
abyss:$CONDA/bin/ABYSS
abyss-pe:$CONDA/bin/abyss-pe
bcftools:$CONDA/bin/bcftools
bedtools:$CONDA/bin/bedtools
bwa:$CONDA/bin/bwa
gblocks:$CONDA/bin/Gblocks
lastz:$CONDA/bin/lastz
mafft:$CONDA/bin/mafft
muscle:$CONDA/bin/muscle
pilon:$CONDA/bin/pilon
raxml-ng:$CONDA/bin/raxml-ng
samtools:$CONDA/bin/samtools
seqtk:$CONDA/bin/seqtk
spades:$CONDA/bin/spades.py
trimal:$CONDA/bin/trimal
velvetg:$CONDA/bin/velvetg
velveth:$CONDA/bin/velveth
snakemake:$CONDA/bin/Snakemake

[workflows]
mapping:$WORKFLOWS/mapping/Snakefile
correction:$WORKFLOWS/contig-correction/Snakefile
phasing:$WORKFLOWS/phasing/Snakefile

----------------
Advanced
----------------

[headers]
trinity:comp\d+_c\d+_seq\d+|c\d+_g\d+_i\d+|TR\d+|c\d+_g\d+_i\d+|TRINITY_DN\d+_c\d+_g\d+i\d+
velvet:node\d+
abyss:node\d+
idba:contig-\d+\d+
spades:NODE_\d+length\d+cov\d+.\d+

[spades]
cov_cutoff:5

For the megahit assembly output shown above, could you suggest a detailed workaround? I would be very grateful. @brantfaircloth

brantfaircloth commented 7 months ago

If you add something like:

megahit:k\d+_\d+

to the [headers] section, that might fix the problem.
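Before rerunning the pipeline, the suggested entry can be checked against a real contig header. This is a rough sketch, assuming the `[headers]` section can be read with Python's standard `configparser` and that each value is applied as a regular expression anchored at the start of the header; how phyluce itself loads and applies these entries is not shown in this thread. The `velvet` and `abyss` values are copied as they appear in the config pasted earlier, and `matching_assemblers` is a hypothetical helper.

```python
import configparser
import re

# [headers] section with the suggested megahit entry added.
# velvet/abyss values are copied as pasted earlier in this thread.
conf_text = r"""
[headers]
velvet:node\d+
abyss:node\d+
megahit:k\d+_\d+
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(conf_text)

# Header of the megahit contig quoted in this thread (leading ">" omitted).
header = "k141_27374 flag=1 multi=2.0000 len=302"


def matching_assemblers(header, patterns):
    # Hypothetical helper: names of the entries whose regex matches the header.
    return [name for name, pat in patterns.items() if re.match(pat, header)]


hits = matching_assemblers(header, dict(parser["headers"]))
print(hits)
```

If the list comes back empty for your own headers, the pattern needs adjusting before phyluce is rerun.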

51mystic commented 7 months ago

Dear teacher: Thank you for your guidance. The results have now been produced successfully. @brantfaircloth