Open BirdmanRidesAgain opened 3 years ago
Did you follow Tutorial 3? My guess is that it is due to the headers not being quite right when you run match_contigs_to_probes… but if you got through the first part of Tutorial 3, the headers should be correct. I’m also on vacation at the moment, so I have limited ability to check things.
Hi Brant. Let me know if you would like me to make this a separate issue, but I am having a similar problem. I assembled my UCE contigs for each sample using itero, and then put all the contig files in a folder named contigs. I also have the Tetrapods-UCE-5Kv1.fasta probe file.
When I run phyluce_assembly_match_contigs_to_probes --contigs contigs/ --probes Tetrapods-UCE-5Kv1.fasta --output match/ I get the same error as above. The header lines from my contig fasta files look like >uce-10_1_length_259_cov_8.247788. I do get a lastz and sqlite file for my first sample only. The lastz file has lines like this:
10271 >uce-2_1_length_304_cov_16.118081 + 82 203 121 >uce-2_p1 |source:faircloth,probes-id:2247,probes-locus:2,probes-probe:1 - 0 120 120 ....:....x...................................................................................................x.-......... 111M1D9M 117/120 97.5% 120/121 99.2%
11298 >uce-3_1_length_358_cov_6.458462 + 106 226 120 >uce-3_p1 |source:faircloth,probes-id:9417,probes-locus:3,probes-probe:1 + 0 120 120 ........................................................................................................................ 120M 120/120 100.0% 120/120 100.0%
Do you have any ideas of how I can fix this error?
You should be able to create/edit ~/.phyluce.conf
and add:
[headers]
trinity:comp\d+_c\d+_seq\d+|c\d+_g\d+_i\d+|TR\d+\|c\d+_g\d+_i\d+|TRINITY_DN\d+_c\d+_g\d+_i\d+
velvet:node_\d+
abyss:node_\d+
idba:contig-\d+_\d+
spades:NODE_\d+_length_\d+_cov_\d+.\d+
itero:uce-\d+_length_\d+_cov_\d+.\d+
This is probably the easiest way to fix it. Alternatively, you could edit the config file in phyluce, which is nested within your conda environment (on my machine the path to that file is ~/miniconda3/envs/phyluce-1.7.1/phyluce/config/phyluce.conf).
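Before running phyluce again, it can help to confirm that the [headers] section parses the way you expect. The following is a minimal sketch using Python's standard configparser; whether phyluce reads the file in exactly this way is an assumption, and load_header_patterns is a hypothetical helper, not part of phyluce.

```python
import configparser

# Two of the [headers] entries suggested above, kept short for the example.
CONF_TEXT = r"""
[headers]
spades:NODE_\d+_length_\d+_cov_\d+.\d+
itero:uce-\d+_length_\d+_cov_\d+.\d+
"""


def load_header_patterns(text):
    """Return {assembler: regex string} from the [headers] section.

    Hypothetical helper for checking the file by hand; not a phyluce function.
    """
    # RawConfigParser skips '%' interpolation, and ':' is one of the
    # default key/value delimiters, so "name:regex" lines parse directly.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    return dict(parser.items("headers"))


patterns = load_header_patterns(CONF_TEXT)
for name, regex in patterns.items():
    print(name, "->", regex)
```

To check a real file instead of the inline text, parser.read(os.path.expanduser("~/.phyluce.conf")) could be used in place of read_string.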
That worked! I just had to make a little tweak to the expression
itero:uce-\d+_\d+_length_\d+_cov_\d+.\d+
Thanks for your help!
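For anyone comparing the two expressions, the difference is the extra _\d+ that accounts for the second number in itero names such as uce-2_1. Here is a rough sketch that tests both patterns against a header taken from the lastz output above. Wrapping the pattern in one capture group and using re.search is an assumption about how the name is extracted, and contig_name is a hypothetical helper.

```python
import re

# Header copied from the lastz output quoted earlier in the thread.
HEADER = ">uce-2_1_length_304_cov_16.118081"

SUGGESTED = r"uce-\d+_length_\d+_cov_\d+.\d+"      # first suggestion
TWEAKED = r"uce-\d+_\d+_length_\d+_cov_\d+.\d+"    # version that worked


def contig_name(pattern, header):
    """Return the captured contig name, or None if the pattern does not match.

    Hypothetical helper; the capture-group wrapping is an assumption.
    """
    match = re.search("({})".format(pattern), header)
    return match.groups()[0] if match else None


print("suggested:", contig_name(SUGGESTED, HEADER))
print("tweaked:  ", contig_name(TWEAKED, HEADER))
```

The first pattern fails because it expects _length directly after the locus number, while the header has _1 in between. Calling .groups() on that failed match, without the None check used here, is what produces the AttributeError.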
I've been trying to pull UCEs out of a couple of full genome sequences: both a .fna file downloaded from GenBank and a song sparrow assembly our lab had on hand. When I run the phyluce_assembly_match_contigs_to_probes command, following the pipeline procedures outlined here (https://phyluce.readthedocs.io/en/latest/daily-use/daily-use-3-uce-processing.html), I consistently get an AttributeError.
Do you have any insights as to why that is? I didn't attach the input .fna file in question, as it is very large, but I can do so if that would be helpful.
INPUT CODE
# the junco file listed above is in the contigs/ directory
phyluce_assembly_match_contigs_to_probes \
    --contigs contigs/ \
    --probes uce-5k-probes.fasta \
    --output junco_hyemalis_UCE/ \
    --log-path log
OUTPUT
  File "/Users/melospiza/miniconda3/envs/phyluce-1.7.1/bin/phyluce_assembly_match_contigs_to_probes", line 421, in <module>
    main()
  File "/Users/melospiza/miniconda3/envs/phyluce-1.7.1/bin/phyluce_assembly_match_contigs_to_probes", line 354, in main
    contig_name = get_contig_name(lz.name1)
  File "/Users/melospiza/miniconda3/envs/phyluce-1.7.1/bin/phyluce_assembly_match_contigs_to_probes", line 279, in get_contig_name
    return match.groups()[0]
AttributeError: 'NoneType' object has no attribute 'groups'
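The last frame of the traceback shows .groups() being called on None, which is what Python's re functions return when a pattern does not match. That fits the guess in this thread that the contig headers do not match any configured pattern. One way to check is to scan the FASTA headers yourself. The sketch below is a hypothetical diagnostic, not part of phyluce; the patterns are copied from this thread, and the GenBank-style header in the example is made up for illustration.

```python
import re

# Patterns copied from the [headers] suggestions in this thread (assumed set).
PATTERNS = {
    "spades": r"NODE_\d+_length_\d+_cov_\d+.\d+",
    "itero": r"uce-\d+_\d+_length_\d+_cov_\d+.\d+",
}


def unmatched_headers(fasta_lines, patterns=PATTERNS):
    """Return FASTA header lines that none of the given patterns match."""
    compiled = [re.compile(p) for p in patterns.values()]
    missing = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        if not any(rx.search(line) for rx in compiled):
            missing.append(line.strip())
    return missing


# Illustrative input: one SPAdes-style header and one made-up genome header.
example = [
    ">NODE_1_length_5000_cov_12.5",
    "ACGT",
    ">NC_000001.1 hypothetical GenBank-style header",
    "ACGT",
]
print(unmatched_headers(example))
```

For a real file, passing an open file handle (for example open("contigs/sample.fna"), with a path of your own) in place of example should work the same way, since the function only iterates over lines. Any header it reports would need either a matching [headers] pattern or renaming before the matching step.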