epi2me-labs / wf-clone-validation


Option for generating multiple reports with subset of samples. #40

Closed gabyrech closed 4 months ago

gabyrech commented 7 months ago

Is your feature related to a problem?

In a core facility setting, where samples from different customers run on the same flow cell, it would be useful to have an option for generating separate, independent reports, each containing only one customer's samples.

Describe the solution you'd like

For instance, take a run with 3 samples from User1 with barcodes 01, 02 and 03, and another 3 samples from User2 with barcodes 04, 05 and 06. I would like to have the option of generating 2 reports: one for User1 (with the results of barcodes 01, 02 and 03) and another for User2 (with the results of barcodes 04, 05 and 06).

I think one possibility could be to include the "user" information as a column in the sample_sheet, so that the workflow can generate a separate report for each "user" in that column. This approach could also be applicable to other workflows that might face the same issue (e.g. https://github.com/epi2me-labs/wf-amplicon).
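To make the proposal concrete, here is a minimal sketch of grouping sample sheet rows by such a column. It is only an illustration, not how the workflow does or will do it: the `user` column name and the sample aliases are hypothetical, and `barcode`/`alias` are assumed to be the required columns.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample sheet: `barcode` and `alias` are assumed to be the
# required columns; `user` is the extra column proposed in this issue.
SAMPLE_SHEET = """\
barcode,alias,user
barcode01,sample_a,User1
barcode02,sample_b,User1
barcode03,sample_c,User1
barcode04,sample_d,User2
barcode05,sample_e,User2
barcode06,sample_f,User2
"""


def group_by_user(sheet_text, column="user"):
    """Return {user: [row, ...]}, keeping the row order of the sheet."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(sheet_text)):
        groups[row[column]].append(row)
    return dict(groups)


groups = group_by_user(SAMPLE_SHEET)
for user, rows in groups.items():
    print(user, [r["barcode"] for r in rows])
```

Each group could then feed one report, so the run above would yield one report for User1 and one for User2.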

Describe alternatives you've considered

Well, I guess the alternative is to manually select the set of samples (barcodes) you want in the report and run the workflow once per user in the run. Not optimal, but it works.

Additional context

No response

mattdmem commented 7 months ago

Thanks for this suggestion @gabyrech

For some other workflows we produce per-sample reports. That’s something we could implement in this workflow.

I also like the idea of being able to identify groups of samples in the sample sheet with an additional column. Our sample sheet processing already allows for the addition of extra columns after those that are required.

You could then either produce grouped reports like you suggested or produce a folder of per-sample reports for each group. That could be compressed for easy distribution.
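As a rough sketch of the "folder per group, compressed" idea, assuming per-sample reports have already been sorted into one subfolder per group (the folder layout and file names below are hypothetical, not the workflow's actual output):

```python
import shutil
import tempfile
from pathlib import Path


def archive_group_reports(reports_dir, out_dir):
    """Zip each immediate subfolder of reports_dir into out_dir/<name>.zip."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for group_dir in sorted(Path(reports_dir).iterdir()):
        if group_dir.is_dir():
            # make_archive appends the ".zip" suffix itself.
            archive = shutil.make_archive(
                str(out_dir / group_dir.name), "zip", root_dir=group_dir
            )
            archives.append(Path(archive))
    return archives


# Demo on a throwaway directory with placeholder report files.
with tempfile.TemporaryDirectory() as tmp:
    reports = Path(tmp) / "reports"
    for user, barcode in [("User1", "barcode01"), ("User2", "barcode04")]:
        (reports / user).mkdir(parents=True)
        (reports / user / f"{barcode}_report.html").write_text("<html></html>")
    archives = archive_group_reports(reports, Path(tmp) / "dist")
    archive_names = [p.name for p in archives]
    print(archive_names)
```

One archive per group keeps distribution simple: each customer receives a single file.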

There are a few options here that are worth exploring. We’ll discuss internally and let you know the outcome.

angelovangel commented 7 months ago

Or just run the pipeline in a loop iterating over every user?

gabyrech commented 7 months ago

@angelovangel of course, but... 1) that is not an option for people running the workflow using the EPI2ME Desktop GUI. 2) I don't think looping over users is an 'elegant' or efficient solution for this.

angelovangel commented 7 months ago

For 1, I agree; 2 is debatable. I have your use case and am using a simple while loop to generate separate reports by user. Of course it would be nice to handle this in the pipeline, as long as the sample sheet requirements and handling stay simple.
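A loop of that kind might look roughly like the sketch below. It is not the script referred to above. It assumes a hypothetical `user` column, that a run only processes the barcodes listed in its sample sheet, and that `--fastq`, `--sample_sheet` and `--out_dir` are the relevant workflow options; check all three against the workflow documentation. The commands are only built and printed, not executed.

```python
import csv
import io
import shlex
import tempfile
from collections import defaultdict
from pathlib import Path


def split_sheet(sheet_text, out_dir, column="user"):
    """Write one sample sheet per value of `column`; return {user: path}."""
    reader = csv.DictReader(io.StringIO(sheet_text))
    fields = [f for f in reader.fieldnames if f != column]
    rows_by_user = defaultdict(list)
    for row in reader:
        user = row.pop(column)  # drop the grouping column from the output
        rows_by_user[user].append(row)
    paths = {}
    for user, rows in rows_by_user.items():
        path = Path(out_dir) / f"{user}_sample_sheet.csv"
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fields)
            writer.writeheader()
            writer.writerows(rows)
        paths[user] = path
    return paths


def build_command(fastq_dir, sheet_path, out_dir):
    """Assumed invocation; verify the option names before running."""
    return [
        "nextflow", "run", "epi2me-labs/wf-clone-validation",
        "--fastq", str(fastq_dir),
        "--sample_sheet", str(sheet_path),
        "--out_dir", str(out_dir),
    ]


sheet = "barcode,alias,user\nbarcode01,a,User1\nbarcode04,d,User2\n"
with tempfile.TemporaryDirectory() as tmp:
    sheets = split_sheet(sheet, tmp)
    commands = [
        build_command("fastq_pass", path, f"output_{user}")
        for user, path in sheets.items()
    ]
    first_sheet_lines = sheets["User1"].read_text().splitlines()

for cmd in commands:
    print(shlex.join(cmd))
```

To actually launch the runs, each command list could be passed to `subprocess.run(cmd, check=True)` while the per-user sample sheets still exist on disk.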

scottcoutts commented 6 months ago

We also run this as a service in a core, and I agree that it would be useful to be able to process these in the GUI. Currently, we process them using Nextflow only for this reason (looping over each sample and creating an archive per sample). It would be useful to us if the sample sheet configuration included sets of samples per customer/job, and either a report per customer/job or even just a report per sample.

mattdmem commented 4 months ago

Hello all,

We're busy implementing this right now, it's at the top of my hit list. Progress has been made:

I'll close this issue; please do keep an eye on the changelogs for this being propagated to our workflows, including this one.

Matt