This PR aims to simplify generating the libraries.tsv and groups.tsv files for the standard use of minute, where the configuration tends to be the same for all the FASTQ files provided in a single run. When it is not, the analysis can be split into separate minute runs.
I added two parameters to minute init, --barcodes and --input, to simplify the table generation process. The user only needs to fill out a single barcodes table and specify which FASTQ file pair is the input.
The idea is that the provided barcode configuration is applied identically to every pair of FASTQ files in the fastq directory, and each pair is normalized to the matching input sample specified by --input. This always normalizes to the first line in the provided --barcodes file and defaults to pooled samples. This step only generates libraries.tsv and groups.tsv from this information; the user is then free to edit these files further before running minute run.
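To make the intended behavior concrete, here is a minimal Python sketch of the expansion described above. It is not the code in this PR: the function name `build_tables`, the `sample_fastqbase` naming scheme, the column order of the returned rows, and the example barcodes are all assumptions made for illustration, and the real libraries.tsv and groups.tsv layouts may differ.

```python
def build_tables(barcodes, fastq_bases, input_base):
    """Expand one barcodes table over every FASTQ pair (illustrative only).

    barcodes:    list of (sample, replicate, barcode) tuples, in file order
    fastq_bases: base names of the FASTQ pairs found in the fastq directory
    input_base:  base name of the pair given via --input
    """
    if input_base not in fastq_bases:
        raise ValueError(f"input {input_base!r} is not among the FASTQ pairs")

    # Unique sample names, keeping the order of the barcodes table so that
    # the sample on its first line stays first in every group.
    samples = list(dict.fromkeys(sample for sample, _, _ in barcodes))

    # Hypothetical libraries rows: the same barcodes applied to every pair.
    libraries = [
        (f"{sample}_{base}", replicate, barcode, base)
        for base in fastq_bases
        for sample, replicate, barcode in barcodes
    ]

    # Hypothetical groups rows: each non-input pair is matched, sample by
    # sample, to the input pair, using pooled replicates on both sides.
    groups = [
        (f"{sample}_{base}", "pooled", f"{sample}_{input_base}", "pooled", base)
        for base in fastq_bases
        if base != input_base
        for sample in samples
    ]
    return libraries, groups


# Made-up example data: three barcoded libraries, one ChIP pair, one input pair.
barcodes = [("WT", 1, "AACCGG"), ("WT", 2, "TTGGCC"), ("KO", 1, "GGTTAA")]
libraries, groups = build_tables(barcodes, ["H3K27me3", "Input"], "Input")
for row in groups:
    print("\t".join(str(field) for field in row))
```

In this sketch the user only describes the barcodes once; everything that depends on the FASTQ pairs is derived, which is why the generated tables can still be edited by hand afterwards.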
I think we developed the software for a more flexible use case, but in practice that case almost never comes up, and it overcomplicates the setup (and the workflow itself) for the much simpler use case that we almost always run.
Edit: Note that I added a second minute run to the test.sh script. I am aware that this might be undesirable because it increases testing time, so it is up for discussion whether we keep it or pick just one of the use cases for testing in the CI script. If we pick only one, I would go for this new use case, because it is more frequent than the custom version with the downloaded, already demultiplexed file.