Closed Fattigman closed 6 months ago
This is the command used by Yggdrasil:
bcl-convert --bcl-input-directory 230616_A00681_0881_BH3MV7DSX7 --output-directory . --force --sample-sheet CTG_SampleSheet_S4_230615.csv --bcl-sampleproject-subdirectories true --strict-mode true --bcl-only-matched-reads true --bcl-num-parallel-tiles 16
For some reason neither the samplesheet nor the runfolder gets symlinked into the workdir. Furthermore, no output can be found in the workdir either.
workdir: /projects/fs1/shared/Nextflow/b0/492d59b5ce54a5c014ebae9e979dc2
Running with stub just copies the whole flowcell into the workdir?!
/projects/fs1/shared/external-tools/nextflow/latest/nextflow -bg run /projects/fs1/shared/Development_Github/Yggdrasil/main.nf --samplesheet /projects/fs1/nas-sync/upload/230616_A00681_0881_BH3MV7DSX7/CTG_SampleSheet_S4_230615.csv --rawdata /projects/fs1/nas-sync/upload/230616_A00681_0881_BH3MV7DSX7 --outdir /projects/fs1/shared/Jobs/ -profile ctg -stub > test.log
I won't continue debugging this until @lokeshbio or @chaetognatha can provide a working example.
I think the first thing to try would be to rebuild the bcl-convert image, put it in the correct location under shared/containers, and then test that directly. I'll see if I have time to do that today.
I think the problem lies more with the nextflow code. As I stated earlier:
For some reason neither the samplesheet nor the runfolder gets symlinked into the workdir.
The bclconvert process dumps the data elsewhere than the workdir, where it should land.
It seems to me there is a weird interaction between nextflow and bclconvert, unless a working example shows otherwise.
However, I am all for cleaning up the container directory!
You may still be right, but I found, for example, that we are using Test_Jobs as the default root for new jobs and for containers, which is incorrect. I would like our future container path to be Shared/Containers, but the legacy location for all the containers is shared/ctg-containers, and changing that would break legacy. We might as well wait until we get COSMOS-SENS and can plan everything from the ground up; the expectation is that legacy won't work there anyway.
I am working on correcting these paths and working through the config atm and am hoping to be done in a little bit, then I will move on and update here as I go!
Sounds good!
The project you tried initially has now been running for a while, but I realize it is rather big so I also started a run using only the minimal test data project.
Seems to work fine. It looks like the multiqc image that I added is broken, so I will update that one and retest @Fattigman
Now it crashed again for the larger 230616 run.
Initial crash reason: Exception thrown in ../src/host/dragen_api/file_io/async_io.cpp line 765 -- Wrote 524288 bytes instead of expected 1048576 bytes. Check disk space. Aborted on 6th partially-complete I/O. i=0
Exception thrown in ../src/host/dragen_api/file_io/async_io.cpp line 765 -- Wrote 581632 bytes instead of expected 1048576 bytes. Check disk space. Aborted on 6th partially-complete I/O. i=0
Dumping diagnostics....
Sample sheet being processed by common lib? Yes
SampleSheet Settings:
CreateFastqForIndexReads = 0
shared-thread-linux-native-asio output is disabled
bcl-convert Version 00.000.000.4.0.3
Copyright (c) 2014-2022 Illumina, Inc.
Command Line: --bcl-input-directory 230616_A00681_0881_BH3MV7DSX7 --output-directory . --force --sample-sheet CTG_SampleSheet_S4_230615.csv --bcl-sampleproject-subdirectories true --strict-mode true --bcl-only-matched-reads true --bcl-num-parallel-tiles 16
Conversion Begins.
# CPU hw threads available: 20
Parallel Tiles: 16. Threads Per Tile: 1
SW compressors: 20
SW decompressors: 10
SW FASTQ compression level: 1
WARNING: Could not write replay file /var/log/bcl-convert/dragen_replay_1687350251432_171770.json: /var/log/bcl-convert/dragen_replay_1687350251432_171770.json: cannot open file
DRAGEN replay file saved to /var/log/bcl-convert/dragen_replay_1687350251432_171770.json
sh: 1: cannot create /var/log/bcl-convert/dragen_info_1687350251432_171770.log: Directory nonexistent
DRAGEN registers saved to /var/log/bcl-convert/dragen_info_1687350251432_171770.log
Hang diagnostic saved to /var/log/bcl-convert/hang_diag_1687350251432_171770.txt
sh: 1: cannot create /var/log/bcl-convert/pstack_1687350251508_171770.log: Directory nonexistent
pstack saved to /var/log/bcl-convert/pstack_1687350251508_171770.log
/bin/sh: 1: terminate called after throwing an instance of 'cannot create /var/log/bcl-convert/pstack_1687350251508_171770.log: Directory nonexistent
AioReturnFailed'
terminate called recursively
/projects/fs1/shared/Nextflow/a6/1e0022ebbffa8b10cc860a6b50ae80/.command.sh: line 2: 171770 Aborted (core dumped) bcl-convert --bcl-input-directory 230616_A00681_0881_BH3MV7DSX7 --output-directory . --force --sample-sheet CTG_SampleSheet_S4_230615.csv --bcl-sampleproject-subdirectories true --strict-mode true --bcl-only-matched-reads true --bcl-num-parallel-tiles 16
Again, comparing with the 221111 run, we can see that the data has been correctly set up with symlinks.
But there are no symlinks for the 230616 run.
I can't make any sense out of it.
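To compare the two runs more systematically, something like the following could list which entries in a task work directory are actually symlinks and where they point. This is only a sketch: `list_staged_inputs` is a hypothetical helper name, and it assumes the inputs are staged as symlinks at the top level of the task directory.

```shell
# Hypothetical helper: print "name -> target" for every symlink found at the
# top level of a Nextflow task work directory.
list_staged_inputs() {
    local workdir="$1"
    local entry
    # -mindepth 1 / -maxdepth 1: only direct children of the work directory
    # -type l: symbolic links only (find does not follow them by default)
    find "$workdir" -mindepth 1 -maxdepth 1 -type l | sort |
        while IFS= read -r entry; do
            printf '%s -> %s\n' "$(basename "$entry")" "$(readlink "$entry")"
        done
}
```

Running `list_staged_inputs /projects/fs1/shared/Nextflow/b0/492d59b5ce54a5c014ebae9e979dc2` on the failing task and on the corresponding 221111 task should make the difference visible; empty output means no symlinks were staged there.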
very strange,
to replicate this working result just do:
nextflow run Yggdrasil/ --rawdata Test_Jobs/Test_Data/SeqOnly/221111_VH00947_17_AACGHWHM5/ --samplesheet Test_Jobs/Test_Data/SeqOnly/CTG_SampleSheet.csv --output Test_Jobs/Test_Out
While testing, I did discover that for some reason the --output parameter is not working properly.
I am wondering if maybe there is a problem with your nextflow binary or the packages you need to load to use it.
I am using nextflow version 22.10.6.5844
Mine is 22.04.5.5709. Can you send me the path to your binary?
It was from my ~/Scripts so I copied it to shared/shared-scripts
I think I also run Java 11.0.2 via lmod
we definitely need an sbatch script for Yggdrasil that loads the right modules and ensures we have the right binaries
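A rough sketch of the version check such a wrapper could do before launching the pipeline. The function names are hypothetical, the minimum version is just the one reported above as working on the test data, and `sort -V` assumes GNU coreutils. Any `module load` lines would go before the check; the module names on our cluster are not given here, so they are left out.

```shell
# Extract the version number from `nextflow -version` output read on stdin
# (expects a line of the form "version 22.10.6 build 5844").
parse_nextflow_version() {
    sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# version_ge A B: succeed if version A is greater than or equal to version B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Fail early if the nextflow on PATH is missing or older than required.
check_nextflow() {
    local required="$1" found
    found="$(nextflow -version 2>/dev/null | parse_nextflow_version)"
    if [ -z "$found" ]; then
        echo "nextflow not found on PATH" >&2
        return 1
    fi
    if ! version_ge "$found" "$required"; then
        echo "nextflow $found is older than required $required" >&2
        return 1
    fi
}
```

In the wrapper this could be called as `check_nextflow 22.10.6 || exit 1` right before the `nextflow run` line.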
I see now that it should be --outdir instead of --output, my bad!
Let's check it out with --outdir instead!
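As far as I know, Nextflow accepts any `--name value` pair as a pipeline parameter and does not complain about names the pipeline never uses, which is how a typo like `--output` can go unnoticed. A launcher script could guard against that with a small check along these lines. This is a sketch only: `unknown_params` is a hypothetical helper, and the list of known parameters is taken from the commands in this thread and may be incomplete.

```shell
# Pipeline parameters seen in this thread (assumed list, extend as needed).
KNOWN_PARAMS="--samplesheet --rawdata --outdir"

# Print every double-dash argument that is not in KNOWN_PARAMS.
# Single-dash Nextflow options such as -profile or -stub are ignored.
unknown_params() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            --*)
                case " $KNOWN_PARAMS " in
                    *" $arg "*) ;;                 # known parameter, fine
                    *) printf '%s\n' "$arg" ;;     # unknown, report it
                esac
                ;;
        esac
    done
}
```

For example, `unknown_params --rawdata run/ --samplesheet sheet.csv --output out/` prints `--output`, so the launcher can stop before starting a long demultiplexing run.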
we definitely need an sbatch script for Yggdrasil that loads the right modules and ensures we have the right binaries
Agree! But this should maybe be done in the bash script that initializes Yggdrasil in cron?
That would make the most sense, I could write one now and put it in Yggdrasil/bin
Nvm, I'm stupid, I thought you meant for my script...
outdir? no unfortunately that was just my bad when I was testing
It crashed again with your binary with the same error. I guess we will postpone the deployment of Yggdrasil until we can get stable demultiplexing.
What I tried to do: Run Yggdrasil on a functional Illumina v2 samplesheet.
The command I ran:
What happened: The demultiplex process crashed with the message:
It looks like bcl-convert thinks there is too little space for a demux, but there is more than enough space available!
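One way to double-check the space claim from inside the same environment the task runs in is to look at both what `df` reports and whether large writes actually complete in the output directory. The sketch below uses two hypothetical helpers. It does not identify the cause of the crash; it only helps rule things out, and `df` may not reflect per-user or per-project quotas.

```shell
# Available space in KiB on the filesystem holding the given directory.
# df -Pk prints POSIX-format output in 1024-byte blocks; column 4 of the
# second line is the available space.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Write N MiB in 1 MiB chunks (the write size seen in the crash message)
# into the directory, verify the resulting file size, then clean up.
write_probe() {
    local dir="$1" mib="${2:-16}" probe
    probe="$(mktemp "$dir/.write_probe.XXXXXX")" || return 1
    if dd if=/dev/zero of="$probe" bs=1048576 count="$mib" 2>/dev/null \
        && [ "$(wc -c < "$probe")" -eq $((mib * 1048576)) ]; then
        rm -f "$probe"
        return 0
    fi
    rm -f "$probe"
    return 1
}
```

Running `avail_kb .` and `write_probe . 1024` from the task work directory, ideally inside the same container, shows whether a short write can be reproduced outside bcl-convert.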