Clinical-Genomics / demultiplexing

To keep scripts associated with execution of the Illumina demultiplexing pipeline

X hasta #82

Closed ingkebil closed 5 years ago

ingkebil commented 5 years ago

This PR rewrites the hiseqx demux to use bcl2fastq 2.20 and to run on a shared file system on an HPC.

This is a major:

See doc 1992:3 for test description and execution.

ingkebil commented 5 years ago

Nah, I don't need the approval to merge yet. I figured if anyone wanted to keep on working on it while I'm holidaying, then they could.

It isn't finished yet. Almost tho! Almost!

ingkebil commented 5 years ago

Alright. This works now.

What is not tested: adding of stats! (since this is not in this repo)

emiliaol commented 5 years ago

Cool, I'll test it on a FC that I need demuxed :)

emiliaol commented 5 years ago

Testing:

Used run 181130_ST-E00269_0322_AHVCFMCCXY:

  1. Copied the run to /home/proj/stage/flowcells/hiseqx/
  2. Made sure the demuxstarted.txt file wasn't in the run folder
  3. Cloned the repo as hiseq.clinical to ~/git/
  4. Ran bash ~/git/demultiplexing/scripts/xcheckfornewrun.bash /home/proj/stage/flowcells/hiseqx/ /home/proj/stage/demultiplexed-runs/
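The preparation steps above can be sketched as a small helper. This is only an illustration: the function names `run_is_pending` and `stage_run` are made up here, and the assumption that a leftover demuxstarted.txt blocks a re-run is inferred from step 2, not taken from the scripts.

```shell
# Hypothetical helpers, not part of the repo.

# Succeeds if the run folder has no demuxstarted.txt marker.
run_is_pending() {
    [ ! -e "$1/demuxstarted.txt" ]
}

# Copy a run folder into the staging flowcell directory and check that the
# copy is ready to be picked up. Prints the staged path on success.
stage_run() {
    run_dir=$1
    stage_dir=$2
    cp -R "$run_dir" "$stage_dir/" || return 1
    staged="$stage_dir/$(basename "$run_dir")"
    if ! run_is_pending "$staged"; then
        echo "demuxstarted.txt present in $staged; remove it before re-testing" >&2
        return 1
    fi
    echo "$staged"
}
```

For the test above this would be called as `stage_run <path to 181130_ST-E00269_0322_AHVCFMCCXY> /home/proj/stage/flowcells/hiseqx`, followed by the xcheckfornewrun.bash call from step 4.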

[screenshot 2019-06-25 13:50:14]

First step worked fine. The jobs are submitted, waiting for them to run. See the logfile: projectlog.20190625134226.log

Failed because I had the wrong environment activated. Trying again with stage.

emiliaol commented 5 years ago

Tested again as hiseq.clinical. I had to make a small change to xdemuxtiles.bash, setting SLURM_ACCOUNT=production instead of development, because hiseq.clinical cannot start jobs using the development account.
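One way to avoid editing the script per user would be to let the caller override the account. A minimal sketch, assuming the script only needs the account name for the `sbatch --account` option; the helper name `slurm_account` and the job script name are placeholders, not taken from xdemuxtiles.bash.

```shell
# Hypothetical helper: use the caller's SLURM_ACCOUNT if set,
# otherwise fall back to "development".
slurm_account() {
    printf '%s\n' "${SLURM_ACCOUNT:-development}"
}

# Print (rather than submit) the resulting sbatch call so it can be inspected.
# "some_job.batch" is a placeholder job script.
print_sbatch_command() {
    echo sbatch --account="$(slurm_account)" some_job.batch
}
```

With this, hiseq.clinical could run with `SLURM_ACCOUNT=production` exported, while other users keep the development default.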

The stage environment was activated and the following command was run: [screenshot 2019-06-25 16:23:49]

The jobs were submitted: [screenshot 2019-06-25 16:22:28]

The file structure was created: [screenshot 2019-06-25 17:21:37] [screenshot 2019-06-25 17:22:09]

And the fastq files have been created with the correct naming and correct sizes: [screenshot 2019-06-25 17:28:02]

The fastq file content is also correct according to specifications: [screenshot 2019-06-25 17:29:13]

emiliaol commented 5 years ago

For good measure I tested the stats adding using the current master of deliver. I added the copycomplete.txt to be able to pick the run up using checkfornewdemuxonhasta.bash: [screenshot 2019-06-25 18:02:40]

The stats files were created in the correct place: [screenshot 2019-06-25 18:03:46]

And they look as expected. [screenshot 2019-06-25 18:06:36]

For some reason cg transfer flowcell does not work in the script: [screenshot 2019-06-25 18:10:05]

But it works using the CLI directly: [screenshot 2019-06-25 18:10:39]

So to do:

ingkebil commented 5 years ago

Nice going with the testing! :strawberry:

So, the cg transfer or anything cg won't work because checkfornewdemuxonhasta.sh doesn't source any environment. We will have to move cg from being an alias to a bash function to solve this problem. It's something trivial to implement but might be a lot of commands to test. Namely, all crontabs.
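For context: bash does not expand aliases in non-interactive shells unless `expand_aliases` is set, whereas a function defined in a sourced file works the same in scripts and crontabs. A rough sketch of what such a function could look like; `CG_ENV_SCRIPT` and `CG_BIN` are hypothetical variables invented for this example, and the real alias definition is not shown in this thread.

```shell
# Hypothetical wrapper; variable names and behaviour are assumptions.
cg() {
    # Subshell: the activated environment does not leak into the caller.
    (
        # Source an environment activation script if one is configured.
        if [ -n "${CG_ENV_SCRIPT:-}" ]; then
            . "$CG_ENV_SCRIPT"
        fi
        # 'command' skips this function and runs the real executable.
        command "${CG_BIN:-cg}" "$@"
    )
}
```

A script such as the checkfornewdemuxonhasta one would then source the file defining this function and call `cg transfer flowcell ...` as before.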

emiliaol commented 5 years ago

Ah, that makes sense. Can we for now add the sourcing so that we can get this in production? And then do the bash function as a separate thing?

emiliaol commented 5 years ago

@ingkebil @barrystokman So I started thinking about this and the scripts/hiseqx/xpostface.batch should add the copycomplete.txt so the deliver part can work. What triggers that? Should it be part of the demux process or something we will have to trigger separately? It did not happen during my test. I saw you activated prod as part of the crontab entry for something else. Would that be an option here too? So that we don't actually have to change anything in deliver?