lxsteiner closed this issue 1 week ago
@lxsteiner - Thanks for the feedback, this is a great idea and very much doable. I will add a proper feature within the next week. For an immediate workaround, mocking up ab1 quality values as you mentioned would work. Just add your sequences to an existing isoQC-formatted file, then add mock data for the missing columns, and you should be able to continue onward to the isoTAX > isoLIB steps as usual.
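A rough sketch of that workaround in base R, for illustration only. The column names `filename`, `seqs_trim`, and `phred_trim` are placeholders and are not taken from the real isoQC format; check the header of an actual isoQC output file and match it before passing anything on to isoTAX. The helper `read_fasta_df` is also hypothetical and not part of isolateR.

```r
# Sketch only: turn FASTA records into rows with mock quality values and
# append them to an isoQC-style table. All column names are hypothetical.

# Parse a FASTA file into a data frame with one row per record.
read_fasta_df <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines)]                 # drop blank lines
  is_header <- startsWith(lines, ">")
  record_id <- cumsum(is_header)                # record index for every line
  headers <- sub("^>", "", lines[is_header])
  # Join the (possibly wrapped) sequence lines belonging to each record.
  seqs <- vapply(seq_along(headers), function(i) {
    paste(lines[!is_header & record_id == i], collapse = "")
  }, character(1))
  data.frame(filename = headers,
             seqs_trim = toupper(seqs),
             stringsAsFactors = FALSE)
}

# Stand-in for an existing isoQC-formatted table
# (in practice this would come from read.csv() on a real file).
existing <- data.frame(filename = "isolate_A.ab1",
                       seqs_trim = "ACGTACGTAC",
                       phred_trim = 45,
                       stringsAsFactors = FALSE)

# Small FASTA written to a temp file so the example is self-contained.
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">ext_isolate_1", "ACGTTGCA", "GGCC",
             ">ext_isolate_2", "ttgacc"), fasta_path)

new_rows <- read_fasta_df(fasta_path)

# Fill every column the FASTA cannot provide with a mock value;
# here the first existing row's value is reused as the placeholder.
for (col in setdiff(names(existing), names(new_rows))) {
  new_rows[[col]] <- existing[[col]][1]
}

# Append the mock rows, keeping the column order of the existing table.
combined <- rbind(existing, new_rows[names(existing)])
print(combined)
```

Reusing an existing row's values as placeholders keeps the column types consistent with the rest of the table; any fixed mock value would serve the same purpose.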
Hi @lxsteiner, just following up on this. I've adjusted the isoTAX function in the latest isolateR package release to allow for input of FASTA files, as requested. Brief overview as follows:
Changes to isoTAX are in L85-L141, which can be inspected here.
#Install the latest version of isolateR (detaching it first if already loaded):
if ("package:isolateR" %in% search()) {detach("package:isolateR", unload=TRUE)}
devtools::install_github("bdaisley/isolateR")
Manual download link for FASTA example: human_gut_isolates_10.fasta
#Download example FASTA file:
download.file("https://github.com/bdaisley/isolateR/raw/main/inst/extdata/fasta_examples/human_gut_isolates_10.fasta",
destfile="T:/human_gut_isolates_10.fasta")
#Run isoTAX with FASTA file as input (Note: 'quick_search=FALSE' recommended for real use scenario)
isoTAX(input="T:/human_gut_isolates_10.fasta", quick_search=TRUE)
The above commands will generate the following output files:
02_isoTAX_results.html (screenshot below)
Manual changes can be made to the mock isoQC table, which can then be re-run through isoTAX. This may be desirable if you want to add custom quality values or other metadata not directly accessible from a raw FASTA file.
isoTAX(input="T:/isolateR_output/01_isoQC_mock_table.csv", quick_search=TRUE)
If nothing was edited in the isoQC mock table, this last line of code will produce functionally the same output as using the FASTA file directly.
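For illustration, a manual edit of the mock table could be scripted in base R roughly as follows. This sketch runs on a tiny stand-in CSV in a temporary directory. The columns `filename`, `phred_trim`, and `notes` are hypothetical and are not taken from the actual isoQC format, so adapt them to the header of the real 01_isoQC_mock_table.csv.

```r
# Sketch only: edit an isoQC-style mock table and save it for isoTAX.
# The stand-in table and its column names are assumptions for illustration.
csv_path <- file.path(tempdir(), "01_isoQC_mock_table.csv")
write.csv(data.frame(filename = c("iso_1", "iso_2"),
                     phred_trim = c(40, 40),
                     notes = NA),
          csv_path, row.names = FALSE)

# In practice, start here with the path to the real mock table.
mock <- read.csv(csv_path, stringsAsFactors = FALSE)

# Replace the mock quality value of one isolate with a custom one.
mock$phred_trim[mock$filename == "iso_2"] <- 55

# Fill in metadata that a raw FASTA file cannot carry.
mock$notes <- "external lab"

# Overwrite the table so the edited version is what gets re-run.
write.csv(mock, csv_path, row.names = FALSE)

# The edited table would then be passed to isoTAX as in the thread
# (requires isolateR to be installed and loaded):
# isoTAX(input = csv_path, quick_search = TRUE)
```

Editing the CSV in place keeps the file name that isoTAX was pointed at earlier in the thread; saving to a new path and passing that path instead would work the same way.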
I hope these additions are helpful. Please let me know if any further adjustments are needed!
Hi,
This looks like such a useful wrapper for handling large collections of Sanger sequences, thank you for publishing it!
I've read the tutorial and manual, but was wondering if it would be possible to pass entries made only from FASTA sequences (and not .ab1 files) as an output of isoQC into isoTAX? Or what could a possible workaround be to still input samples where only FASTA sequences exist (e.g. make up mock ab1 quality values, make them into .ab1 files, and process them in isoQC)?
The motivation is that in-house we of course have .ab1 files from which FASTA sequences were eventually extracted and used for taxonomic identification etc. But sequences from other labs that we want to use in the same collection/pipeline (e.g. for taxonomic identity) are usually only available, and made public, as FASTA.
It would be great if this were possible all within isolateR, otherwise it's again a chore to process own samples with .ab1 files here, make collections, export FASTA, add external FASTA collections, redo taxonomic identifications with whatever tool, summarize taxonomy on your own.
Do you see any possible workaround at the moment or possibly implementing a similar feature in the future?
Thanks.