hsmurali / SCRAPT

SCRAPT: An Iterative Algorithm for Clustering Large 16S rRNA Gene Datasets

Available as bioconda package? #1

Open d4straub opened 1 year ago

d4straub commented 1 year ago

Hi there,

I browsed your interesting paper (which isn't mentioned in the README) and was intrigued by the speed improvement alongside the promise of performance similar to DADA2. Are you planning to add the tool to bioconda? That would make it available as a conda package and as a container (bioconda packages are also available as Docker and Singularity containers). Packaging and containerization would make the tool even more useful, since users could rely on the package or container instead of going through an installation procedure. This allows efficient use in pipelines and makes analyses reproducible.

Small side questions: the README states

  -f FILEPATH, --filepath FILEPATH
                        Path to the file containing 16S sequences. At the
                        moment we support only fatsa file containing the 16S
                        reads.

Best, Daniel

hsmurali commented 1 year ago

Thank you! We don't have it on bioconda yet. We will seriously consider making it available as a bioconda package.

  1. The current version of SCRAPT takes a FASTA file as input, since it assumes the sequences have already been preprocessed and quality filtered. We recommend using QIIME2 to preprocess sequences from all samples and pooling the reads that pass the quality filter into a single FASTA file.

  2. SCRAPT takes sequences pooled from all samples as input. This is similar to the pooled mode of DADA2. We plan to provide a preprocessing pipeline in future iterations of SCRAPT. We also plan to provide a sample x OTU table that can be used for differential abundance testing.

  3. SCRAPT can be applied to any amplicon sequencing dataset, not just 16S rRNA. In the paper we present some results on the 18S rRNA gene, a marker found in microeukaryotes.
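
As a rough illustration of the pooling step described in points 1 and 2, the sketch below concatenates per-sample FASTA files (assumed to be already quality filtered) into a single pooled FASTA file. The function name `pool_fasta`, the convention that each file name is the sample name, and the `sample|read_id` header format are assumptions made for this example; they are not part of SCRAPT or QIIME2.

```python
from pathlib import Path


def pool_fasta(sample_files, pooled_path):
    """Concatenate per-sample FASTA files into one pooled FASTA file.

    Each header is rewritten to '>{sample}|{original_id}' so that a read can
    still be traced back to its sample after clustering.
    Returns the number of reads written.
    """
    n_reads = 0
    with open(pooled_path, "w") as out:
        for path in map(Path, sample_files):
            # Assumed convention: the file name (without extension) is the sample name.
            sample = path.stem
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        # Header line: prefix the read ID with the sample name.
                        out.write(f">{sample}|{line[1:]}\n")
                        n_reads += 1
                    else:
                        # Sequence line: copy unchanged.
                        out.write(line + "\n")
    return n_reads
```

The pooled file could then be passed to SCRAPT through the `-f` / `--filepath` option quoted above. Keeping the sample name in each header is one possible way to rebuild per-sample counts later, for example when assembling a sample x OTU table by hand.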

Please feel free to reach out if you have further questions about preprocessing raw read files.

Thank you.