Metadata file without barcode or adapter

AusARG / pipesnake

ausarg/pipesnake is a bioinformatics best-practice analysis pipeline for phylogenomic reconstruction starting from short-read 'second-generation' sequencing data.

MIT License


Metadata file without barcode or adapter #7

Closed Phismil closed 7 months ago

Phismil commented 8 months ago

Description of feature

Dear developers, thank you for this extremely useful pipeline. I am wondering whether it is possible to run the pipeline using data that have already been debarcoded and adapter-trimmed? Thank you in advance

IanGBrennan commented 8 months ago

Hey there. As currently implemented, I don't think it can be run without providing sequences for adapters/indexes, because the workflow passes data through TRIMMOMATIC with no way to bypass it. But I think we could make some minor adjustments to use the --stage option to start the pipeline from just before the assembly stage and then continue on. This would allow passing debarcoded/adapter-trimmed reads to pipesnake.

Is pipesnake something you're likely to use if we are able to implement this? From my perspective, it's always nice to make our workflow more flexible. Cheers.

Phismil commented 8 months ago

Dear Ian, thank you for your kind response. Yes, we intend to use the pipeline intensively. Since most of us these days receive demultiplexed, and most of the time quality-filtered, data from genomics cores, or alternatively download demultiplexed data from the NCBI SRA database, it would be a significant enhancement for the pipeline to be able to start from such data with the --stage option. Cheers

ziadbkh commented 7 months ago

This should be fine. Hopefully, it will be available next week.

ziadbkh commented 7 months ago

@Phismil Try the workflow in the v1.1 branch, as we have added a solution for your case: you can use the parameter --disable_adapter_trimming to skip the adapter trimming. Use the same input samplesheet and ignore the barcode and adapter columns. Please let us know how you go with your testing so we can close the issue and merge the new version.

IanGBrennan commented 7 months ago

Just to specify, @Phismil: when you pull pipesnake, use the -r option to grab v1.1

nextflow pull ausarg/pipesnake -r v1.1 ...

when you run the pipeline you'll do the same

nextflow run ausarg/pipesnake -r v1.1 ...

Once we're happy with that we can merge these changes into the main. I've tested it out on some cleaned reads, and it appears to work smoothly. Please let us know if you run into further issues.
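
Putting the two suggestions above together, a run on already-trimmed reads might look roughly like the sketch below. Only `-r v1.1` and `--disable_adapter_trimming` come from this thread; the `--input` and `--outdir` parameters, the `-profile docker` setting, and the file names are assumptions for illustration, so check the pipeline's own documentation for the actual parameter names. The script only assembles and prints the command rather than launching it.

```shell
# Sketch only: combines the v1.1 revision with the new skip-trimming flag.
# ASSUMPTIONS (not stated in this thread): --input, --outdir, -profile docker,
# and both file names below are hypothetical placeholders.
samplesheet="samplesheet.csv"   # same samplesheet as before; barcode/adapter columns are ignored
outdir="results"                # hypothetical output directory

cmd="nextflow run ausarg/pipesnake -r v1.1 -profile docker --input ${samplesheet} --outdir ${outdir} --disable_adapter_trimming"

# Print the assembled command so it can be inspected before running it.
echo "$cmd"
```

Once the printed command looks right for your setup, it can be pasted into the terminal and run directly.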

Phismil commented 7 months ago

Dear Ian and Ziad, thank you for considering my suggestion in such a short time. I tried the new option on multiple data sets and was able to successfully assemble the desired loci across different taxa. I also tried it on a few NCBI data sets and it ran without any issue. Cheers

ziadbkh commented 7 months ago

resolved in #9