jbloomlab / barcoded_flu_pdmH1N1

Barcoded pdmH1N1 virus hashing experiment

sample names for progeny production pilot #43

Open jbloom opened 3 years ago

jbloom commented 3 years ago

@dbacsik says:

N.b. Also note that some of the samples in this pilot were assigned to the wrong sequencing indices, and this notebook manually corrects this in the results folder. This needs to be revised so that the notebook can be run automatically.

jbloom commented 3 years ago

@dbacsik, you made a correction to the sample indices in commit a4cca72f304, but that was to the now-deleted data/experiments_config.yml, not to config.yaml, where configuration now lives.

Were those changes designed to fix this issue or some other problem? In any case, you should make these same changes where the experimental configuration currently lives, in config.yaml. But how about waiting until we get new_refactor_pipeline merged into main first, and then you can do this as a separate pull request? (So hold on just a bit.)

dbacsik commented 3 years ago

These are two different errors.

The ones I fixed in commit a4cca72 were simple typos. I will repeat these corrections in config.yaml later.

The issue started here (#43) references a bigger problem, which is that the barcode sequencing samples were mis-labelled with the wrong indices when the sequencing was submitted. This will need to be remedied by either:

  1. Replacing the SampleSheet.csv file in the sequencing run's folder with a corrected copy, or
  2. Adding a step to de-multiplex barcode sequencing data from their original BCL files to the Snakemake pipeline.
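
As a rough illustration of option 1, here is a minimal Python sketch of how a corrected copy of a sample sheet might be generated. It is not code from this repository. It assumes the sheet has a `[Data]` section whose header row includes `Sample_ID` and `index` columns; the function name `correct_sample_sheet` and the sample names and index sequences in the usage note are hypothetical.

```python
import csv
import io


def correct_sample_sheet(sheet_text, corrected_index):
    """Return a copy of sheet_text with corrected index sequences.

    sheet_text: full text of a sample sheet (assumed to contain a [Data] section).
    corrected_index: dict mapping Sample_ID -> corrected index sequence (hypothetical input).
    """
    lines = sheet_text.splitlines()

    # Find the [Data] section; everything up to and including it is copied unchanged.
    data_start = None
    for i, line in enumerate(lines):
        if line.strip().startswith("[Data]"):
            data_start = i
            break
    if data_start is None:
        raise ValueError("no [Data] section found in sample sheet")
    preamble = lines[: data_start + 1]

    # Parse the rows after [Data] as CSV, using the first row as the header.
    reader = csv.DictReader(lines[data_start + 1:])
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()

    seen = set()
    for row in reader:
        sample = row["Sample_ID"]
        if sample in corrected_index:
            # Overwrite only the index; all other columns are left as submitted.
            row["index"] = corrected_index[sample]
            seen.add(sample)
        writer.writerow(row)

    # Fail loudly if a requested correction did not match any sample in the sheet.
    missing = set(corrected_index) - seen
    if missing:
        raise KeyError(f"samples not found in sheet: {sorted(missing)}")

    return "\n".join(preamble) + "\n" + out.getvalue()
```

For example, calling `correct_sample_sheet(text, {"sampleA": "CCCC", "sampleB": "AAAA"})` would swap the indices of two hypothetical samples that were submitted with each other's index. Writing the result to a new file, rather than overwriting the original, would keep the as-submitted sheet available for reference.
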

dbacsik commented 3 years ago

For the time being, I am going to leave this pilot branch alone. As this issue describes, the FASTQ files are manually demuxed in the project folder, rather than on the sequencing server, because I needed to correct the sample-index pairs.

The bulk progeny production pilot experiment is analyzed on the progeny_pilot_v2 branch. This pilot has served its purpose for now, but may be referenced as we build out the viral barcode analysis. Please do not delete this branch.

jbloom commented 3 years ago

@dbacsik, I'm re-opening this issue. Every time you close an issue (unless we decide that it really doesn't matter at all), the closure should be associated with a pull request into main that addresses the issue.

As far as I can tell, we still don't have a pull request into main that addresses the issue here (correct specification of these data). I think your proposed solution of providing specifications for FASTQs that are manually de-multiplexed correctly is probably fine, but until a pull request into main provides those specifications, this issue should stay open.

dbacsik commented 3 years ago

Roger that!