This is definitely due to the contig naming scheme, which looks like 3_0.277257%_cov_268 len=1326. This format is not expected by phyluce (and it is also not produced by phyluce). If you would like to use these data without changing the contig headers, you will need to adjust the regular expressions that phyluce uses to identify contigs. These can be adjusted in the phyluce configuration file, which you can create at ~/.phyluce/config.
One other quick note: it may be easier to just re-assemble the data for those contigs with the weird headers (that's a really odd format).
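A third route, not spelled out above, is to change the contig headers themselves. Here is a minimal Python sketch of that idea. It replaces each FASTA header with a plain numbered name and keeps a mapping back to the original. The function name `rename_headers` and the `contig` prefix are hypothetical, and whether phyluce recognizes the new names still depends on the header regular expressions in its configuration, so treat the naming pattern as an assumption to check.

```python
def rename_headers(lines, prefix="contig"):
    """Return (new_lines, mapping) with each FASTA header replaced by '>prefix_N'.

    `prefix` is a hypothetical naming choice, not something phyluce prescribes.
    `mapping` records new name -> original header so no information is lost.
    """
    new_lines = []
    mapping = {}
    count = 0
    for line in lines:
        if line.startswith(">"):
            count += 1
            new_name = "{}_{}".format(prefix, count)
            # Keep the original header (minus '>' and newline) for later lookup.
            mapping[new_name] = line[1:].rstrip("\n")
            new_lines.append(">" + new_name + "\n")
        else:
            # Sequence lines pass through unchanged.
            new_lines.append(line)
    return new_lines, mapping


# Headers taken from this thread; the sequences are made-up placeholders.
example = [
    ">3_0.277257%_cov_268 len=1326\n",
    "ACGT\n",
    ">1723_0.00852258%_cov_13 len=856\n",
    "TTGA\n",
]
renamed, mapping = rename_headers(example)
```

For a real file, the same function could be fed `open(path)` and the result written back out with `writelines`. Keeping the mapping means coverage and length values from the old headers can still be recovered afterwards.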
Thank you very much Brant!
Gabi Camacho
Dear,
I'm having problems running phyluce_assembly_match_contigs_to_probes on my data. It gives the following error:
19:29:57,008 - phyluce_assembly_match_contigs_to_probes - INFO - ======= Starting phyluce_assembly_match_contigs_to_probes =======
2020-09-22 19:29:57,008 - phyluce_assembly_match_contigs_to_probes - INFO - Version: git fatal: Not a git repository: '/home/bonnie/anaconda3/envs/py27/lib/python2.7/site-packages/.git'
2020-09-22 19:29:57,008 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --contigs: /media/bonnie/Data_Drive/others/Heteroponerinae-contigs
2020-09-22 19:29:57,008 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --dupefile: None
2020-09-22 19:29:57,008 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --keep_duplicates: None
2020-09-22 19:29:57,008 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --log_path: /media/bonnie/Data_Drive/others/log-heteroponerinae
2020-09-22 19:29:57,009 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --min_coverage: 50
2020-09-22 19:29:57,009 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --min_identity: 80
2020-09-22 19:29:57,009 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --output: /media/bonnie/Data_Drive/others/uce-search-cov50-heteroponerinae
2020-09-22 19:29:57,009 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --probes: /media/bonnie/Data_Drive/others/hym-probes-v2-ant-specific-uce-only.fasta
2020-09-22 19:29:57,009 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --regex: ^(uce-\d+)(?:_p\d+.*)
2020-09-22 19:29:57,009 - phyluce_assembly_match_contigs_to_probes - INFO - Argument --verbosity: INFO
2020-09-22 19:29:57,151 - phyluce_assembly_match_contigs_to_probes - INFO - Creating the UCE-match database
2020-09-22 19:29:57,325 - phyluce_assembly_match_contigs_to_probes - INFO - Processing contig data
2020-09-22 19:29:57,326 - phyluce_assembly_match_contigs_to_probes - INFO - -----------------------------------------------------------------
2020-09-22 19:30:26,227 - phyluce_assembly_match_contigs_to_probes - INFO - ECTA07: 2332 (3.55%) uniques of 65683 contigs, 0 dupe probe matches, 70 UCE loci removed for matching multiple contigs, 22 contigs removed for matching multiple UCE loci
2020-09-22 19:30:44,493 - phyluce_assembly_match_contigs_to_probes - INFO - GAB75: 2261 (8.39%) uniques of 26951 contigs, 0 dupe probe matches, 43 UCE loci removed for matching multiple contigs, 12 contigs removed for matching multiple UCE loci
2020-09-22 19:31:06,975 - phyluce_assembly_match_contigs_to_probes - INFO - GPC01: 2325 (5.59%) uniques of 41576 contigs, 0 dupe probe matches, 73 UCE loci removed for matching multiple contigs, 15 contigs removed for matching multiple UCE loci
2020-09-22 19:31:22,458 - phyluce_assembly_match_contigs_to_probes - INFO - GPC02: 2096 (9.62%) uniques of 21791 contigs, 0 dupe probe matches, 70 UCE loci removed for matching multiple contigs, 2 contigs removed for matching multiple UCE loci
2020-09-22 19:44:30,714 - phyluce_assembly_match_contigs_to_probes - INFO - RHY06: 2300 (6.48%) uniques of 35473 contigs, 0 dupe probe matches, 86 UCE loci removed for matching multiple contigs, 18 contigs removed for matching multiple UCE loci
Traceback (most recent call last):
  File "/home/bonnie/anaconda3/envs/py27/bin/phyluce_assembly_match_contigs_to_probes", line 342, in
    main()
  File "/home/bonnie/anaconda3/envs/py27/bin/phyluce_assembly_match_contigs_to_probes", line 289, in main
    for lz in lastz.Reader(output):
  File "/home/bonnie/anaconda3/envs/py27/lib/python2.7/site-packages/phyluce/lastz.py", line 119, in iter
    yield self.next()
  File "/home/bonnie/anaconda3/envs/py27/lib/python2.7/site-packages/phyluce/lastz.py", line 140, in next
    lastz_result_split[k] = float(v.strip('%'))
ValueError: could not convert string to float: >1723_0.00852258%_cov_13 len=856
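The last two traceback lines can be reproduced in isolation. The sketch below is only an illustration of that one conversion, float(v.strip('%')). It is not phyluce's actual parsing code, and the helper name `parse_percent` is hypothetical. It shows that `strip('%')` only trims percent signs at the ends of a string, so a contig header with a `%` in the middle still cannot be converted to a float.

```python
def parse_percent(value):
    """Mimic the conversion shown in the traceback: float(v.strip('%'))."""
    return float(value.strip("%"))


# A genuine percentage field (an illustrative value) parses cleanly.
ok = parse_percent("97.5%")

# The contig header from the error message does not: the '%' sits in the
# middle of the string, so strip('%') leaves it untouched and float() fails.
header = ">1723_0.00852258%_cov_13 len=856"
try:
    parse_percent(header)
    header_failed = False
except ValueError:
    header_failed = True
```

Under that reading, any header containing a `%` would hit the same error if it reaches this conversion, which would fit the contigs in the second set failing while the first set runs.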
I believe the problem is that I have two sets of contigs with two different headers:
Set 1 runs fine, with the following headers:
Set 2 won't run, either together with the other contigs or separately, and it has the following headers:
Could you help me solve this problem, please?
Thank you in advance.