lauramilena3 / On-rep-seq

Bulk Typing of Bacterial Species down to Strain Level
MIT License

Long computation time processing missing barcode samples #7

Open AlexGaithuma opened 4 years ago

AlexGaithuma commented 4 years ago

Hi. I am not a programming expert, so I don't know the finer details of the code, but when running the pipeline I noticed that it takes quite some time to process extra barcode files that are empty (0 bytes). For example, my data has 96 samples, repBC01-repBC96, so the barcode files repBC97-repBC192 are empty. The pipeline still runs every step on these empty files, and loading the database information during taxonomic assignment takes the bulk of that time. It would save a lot of time if the pipeline skipped empty demultiplexed FASTQ files.

Thanks for making this great tool available.

lauramilena3 commented 4 years ago

Thank you for your comment Alex. I will fix the code to address this issue.

AlexGaithuma commented 4 years ago

In the meantime, I worked around this by editing the config.yaml file and removing the barcodes that are not in my data.
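
For anyone who wants to avoid checking the files by hand, here is a minimal sketch of how the non-empty barcodes could be listed before editing config.yaml. It is not part of On-rep-seq: the function name `non_empty_barcodes`, the `*.fastq` pattern, and the assumption that each barcode has one uncompressed FASTQ file named after it (e.g. `repBC01.fastq`) are all hypothetical. Gzipped files would need a different check, since an empty gzip archive is not 0 bytes.

```python
from pathlib import Path


def non_empty_barcodes(fastq_dir, pattern="*.fastq"):
    """Return sorted barcode names whose FASTQ file is larger than 0 bytes.

    Hypothetical helper: assumes one uncompressed FASTQ per barcode,
    named <barcode>.fastq, inside fastq_dir.
    """
    names = []
    for path in sorted(Path(fastq_dir).glob(pattern)):
        # st_size is the file size in bytes; 0 means no reads were written.
        if path.is_file() and path.stat().st_size > 0:
            names.append(path.stem)
    return names


if __name__ == "__main__":
    import sys

    # Directory with the demultiplexed files; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in non_empty_barcodes(target):
        print(name)
```

The printed names are the barcodes to keep in config.yaml; how they are listed there depends on the layout of that file, which this sketch does not touch.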