Closed mhoban closed 1 year ago
I think this can be done with a CSV file that maybe looks something like this:
| sample_id | file_pattern |
|---|---|
| sample1 | file1_nonsensefff{R1,R2}.fastq |
| sample2 | file2_nonsensefff{R1,R2}.fastq |
| sample3 | file3_nonsensefff{R1,R2}.fastq |
Then, right after the reads are loaded, we just go through and reinterpret the sample ID part of the reads tuple using this CSV map. Is it easy to load and use CSV files? Maybe I need an external script. Actually, I can probably do something clever with awk (viz. building an associative array), and we can just use a two-column tab-separated file (with no headers) rather than an actual CSV.
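For illustration, the awk idea could look roughly like the sketch below. It is not the code from the commit. The column order (ID derived from the filename first, desired sample ID second), the `remap_id` helper, and the temporary map file are all assumptions made up for this example.

```shell
# Hypothetical two-column, tab-separated map with no header (an assumption):
# column 1 = ID derived from the filename, column 2 = desired sample ID.
map_file=$(mktemp)
printf 'file1_nonsensefff\tsample1\nfile2_nonsensefff\tsample2\n' > "$map_file"

# remap_id ID MAPFILE
# Prints the mapped sample ID, or the original ID when the map has no entry.
remap_id() {
  awk -F '\t' -v id="$1" '
    { map[$1] = $2 }                               # build the associative array
    END { print ((id in map) ? map[id] : id) }     # look up, fall back to input
  ' "$2"
}

remap_id file1_nonsensefff "$map_file"
remap_id unknown_id "$map_file"
```

Falling back to the original ID means samples missing from the map still pass through with their filename-derived ID rather than being dropped.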
This is now implemented in 5b14cdfd1af5ca87032fc91c28c8e63df94d9c92
Right now, the sample IDs come from fromFilePairs: anything before the R1/R2 in the filename. Figure out how to let the user customize this somehow, probably by giving the option to pass a regex of some sort.
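As a rough sketch of the regex option (not something the pipeline is known to implement), a user-supplied pattern could pull the sample ID out of a filename through its first capture group. The `id_regex` variable, the `extract_id` helper, and the example filename are hypothetical.

```shell
# Hypothetical user-supplied extended regex (an assumption for illustration):
# the first capture group is taken as the sample ID.
id_regex='^(.+)_R[12]\.fastq$'

# extract_id FILENAME
# Prints the captured sample ID; a non-matching filename is printed unchanged.
# Note: a regex containing "/" would need a different sed delimiter.
extract_id() {
  printf '%s\n' "$1" | sed -E "s/${id_regex}/\1/"
}

extract_id 'sampleA_L001_R1.fastq'
```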