Open charbeez opened 9 months ago
Howdy,
You can do either: modify the headers so they look like what is expected for SPAdes or Trinity (but not both), or integrate the two databases into one; see the "Incorporating Outgroup Data" sections starting here. You use both databases for this step and for the next step, and then you should be good to go.
Basically, you'll treat one data source or the other as "outgroup data".
Thank you for getting back to me so quickly! I tried the above incorporating outgroup/other data with:
phyluce_assembly_get_match_counts \
    --locus-db /fs/scratch/PAS1918/CB_UCE_NMNH/Final_tree_MOO+transcriptomes/MOOs+ZOs/uce-search-results/probe.matches.sqlite \
    --taxon-list-config /fs/scratch/PAS1918/CB_UCE_NMNH/Final_tree_MOO+transcriptomes/taxon-list.conf \
    --taxon-group 'dataset' \
    --extend-locus-db /fs/scratch/PAS1918/CB_UCE_NMNH/Final_tree_MOO+transcriptomes/Transcriptomes/uce-search-results/probe.matches.sqlite \
    --output /fs/scratch/PAS1918/CB_UCE_NMNH/Final_tree_MOO+transcriptomes/dataset.conf
and I'm now getting another error; I've copied the final few lines of the log below. It looks like the command ran to completion using both separate probe.matches.sqlite databases, but every sample reports that UCE loci failed to be detected. Do you have any suggestions on how to proceed?
2023-09-29 11:24:23,630 - phyluce_assembly_get_match_counts - INFO - Failed to detect 1373 UCE loci in MOO_53_16
2023-09-29 11:24:23,630 - phyluce_assembly_get_match_counts - INFO - Failed to detect 1363 UCE loci in MOO_52_14
2023-09-29 11:24:23,630 - phyluce_assembly_get_match_counts - INFO - Failed to detect 1352 UCE loci in MOO_46_14
2023-09-29 11:24:23,632 - phyluce_assembly_get_match_counts - INFO - Writing the taxa and loci in the data matrix to /fs/scratch/PAS1918/CB_UCE_NMNH/Final_tree_MOO+transcriptomes/dataset.conf
2023-09-29 11:24:23,633 - phyluce_assembly_get_match_counts - INFO - ========== Completed phyluce_assembly_getmatch counts ==========
My taxon-list.conf file begins with [dataset], and each taxon in the second, transcriptome-sourced set (the one in the --extend-locus-db database) is denoted with an asterisk after its name.
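For illustration, a taxon-list.conf laid out as described above might look like the sketch below. The three MOO names are taken from the log output; the two transcriptome names are hypothetical placeholders, not names from the actual dataset.

```ini
[dataset]
MOO_53_16
MOO_52_14
MOO_46_14
; hypothetical names for taxa that live in the --extend-locus-db database
transcriptome_taxon_1*
transcriptome_taxon_2*
```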
Thanks again for your help!
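One way to narrow down a problem like this is to check that every name in the taxon list can actually be found in the database it is supposed to come from. The sketch below uses only Python's standard library. It assumes that taxa appear as column names in some table of each probe.matches.sqlite file and that names are compared case-insensitively; both are assumptions about the database layout and should be verified against your own files. The function names are made up for this example and are not part of phyluce.

```python
import configparser
import sqlite3


def read_taxa(conf_path, group):
    """Return {taxon_name: is_external} for one [group] of a taxon-list config.

    A trailing asterisk marks a taxon as coming from the extended database.
    """
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.optionxform = str  # keep taxon names exactly as written
    parser.read(conf_path)
    return {name.rstrip("*"): name.endswith("*") for name in parser[group]}


def table_columns(db_path):
    """Return {table_name: [column names]} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        return {
            table: [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
    finally:
        conn.close()


def missing_taxa(conf_path, group, main_db, extended_db):
    """List taxa that are not a column of any table in their source database.

    ASSUMPTION: each taxon is stored as a column name; check this against
    the output of table_columns() before trusting the result.
    """
    def all_columns(db_path):
        return {col.lower()
                for cols in table_columns(db_path).values()
                for col in cols}

    main_cols = all_columns(main_db)
    extended_cols = all_columns(extended_db)
    missing = []
    for name, is_external in read_taxa(conf_path, group).items():
        columns = extended_cols if is_external else main_cols
        if name.lower() not in columns:
            missing.append(name)
    return missing
```

Calling `missing_taxa` with the taxon-list.conf path, the group name `dataset`, and the two probe.matches.sqlite paths returns the names that could not be matched. Printing `table_columns(...)` for each database first is a quick way to confirm whether the column-name assumption holds.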
Hi Brant, I'm having trouble running phyluce_assembly_match_contigs_to_probes on assemblies from both SPAdes and Trinity. The headers for each are accepted when run independently, and both datasets work if run separately, but each run then generates its own separate probe.matches.sqlite database. Is there a way to either 1) combine the SPAdes and Trinity datasets so that phyluce reads them together (for example, by altering the headers of one or the other type of file) and produces one sqlite database, or 2) merge the two sqlite databases for downstream processing?
My headers for each fasta type, for reference. Both appear to fit the regex in the /.phyluce/config, and are processed without a problem independently, but I'd like to incorporate and process these data together.
Thanks!