Closed Ni-Ar closed 1 year ago
Hi Nicco,
Sorry about the STAR fail! You can also set the limits even higher, shouldn't hurt! Exactly, you can just set the stage to Mapping as you found.
Just make sure you edit the changes into the .run.yaml that zUMIs has created when you first started the run, there is some things that get added by the pipeline in there.
Small sidenote: Depending on how many resources (threads / memory) you make available for zUMIs, rerunning from Filtering can still be faster because then its possible to run a bunch of STAR jobs in parallel.
Let me know if I can help otherwise! Best, Christoph
Oops I deleted the .run.yaml
file as I thought it might cause some conflicts in the re-running 🤦♂️ I anyway increased the limits and asked for more resources; the re-run job is still queued. I'll see how things develop. Thanks for the help!
Hi again,
I restarted the Mapping
stage without the .run.yaml
file, but the job allocated resources finished shortly after mapping ended and the Counting
started. I believe the start mapping went well as I was checking the .Out.log
file and it showed no error.
So I restarted the Counting
stage (this time with the .run.yaml
file present and I got the following error:
[E::hts_open_format] Failed to open file "/path/to/HagemannJensen_2022/output_dir/hPBMCs_RunF.filtered.Aligned.GeneTagged.sorted.bam.tmp.**NNNN**.bam" : File exists
where "NNNN" applies for files 0000, 0015, 0004, 0011, 0013... (it looks random to me).
If I do the following:
ls -l /path/to/HagemannJensen_2022/output_dir/hPBMCs_RunF.filtered.Aligned.GeneTagged.sorted.bam.tmp* | wc -l
704
There are 704 temp files.
I addition I get this in the error log:
[E::hts_open_format] Failed to open file "/path/to/HagemannJensen_2022/output_dir/hPBMCs_RunF.filtered.Aligned.GeneTagged.sorted.bam" : No such file or directory
samtools index: failed to open "/path/to/HagemannJensen_2022/output_dir/hPBMCs_RunF.filtered.Aligned.GeneTagged.sorted.bam": No such file or directory
Meaning something failed to create a sorted output?
And the error log continues with:
Error in value[[3L]](cond) :
failed to open BamFile: file(s) do not exist:
‘/path/to/HagemannJensen_2022/output_dir/hPBMCs_RunF.filtered.Aligned.GeneTagged.sorted.bam’
Calls: reads2genes_new ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>
Execution halted
Loading required package: yaml
Loading required package: Matrix
Warning message:
package ‘R6’ was built under R version 4.0.5
Error in gzfile(file, "rb") : cannot open the connection
Calls: rds_to_loom -> readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
cannot open compressed file '/path/to/HagemannJensen_2022/output_dir/zUMIs_output/expression/hPBMCs_RunF.dgecounts.rds', probable reason 'No such file or directory'
Execution halted
and
Error in gzfile(file, "rb") : cannot open the connection
Calls: readRDS -> gzfile
In addition: Warning message:
In gzfile(file, "rb") :
cannot open compressed file '/path/to/HagemannJensen_2022/output_dir/zUMIs_output/expression/hPBMCs_RunF.dgecounts.rds', probable reason 'No such file or directory'
Execution halted
which I assume are subsequent errors cause by the fact that the previous bam file was not generated.
Any solution? Besides re-running everything from the beginning which is gonna take ~5 days.
Thanks a lot! Nicco
Hi Nicco,
Sorry I must have totally missed your message here!
The tmp files you see are from samtools sort, so that must have definitely aborted somehow.
From your description its a bit unclear if the mapping output is really fine or no, so maybe the safest route I would suggest is too bite the bullet and run it from the top.
Make sure that you can have sufficient open file handles so that the samtools sort step can allocate as many tmp files as it needs! using the ulimit -n
command in your shell.
Anyways since I am so late to reply here, you probably already managed to solve this but let me know if I can help further!
best, Christoph
Hey yes, I re-run end-to-end everything and it worked fine. Thanks a lot, Nicco
Not really a bug, more like a general usage question. Apologies if this was described somewhere I couldn't find details about this in the Wiki.
I'm running zUMIs on smart-seq3xpress PBMC data (E-MTAB-11452/combinedFCL.read{1,2}.fastq.gz).
The filtering step went fine, but STAR failed because:
So in the yaml file I'm gonna add the following line that should hopefully solve this problem.
The log output I got so far was:
In order to save time and resources, if I keep my
out_dir
path the same and re-run with the same yaml parameters, can I skip theFiltering
step and repeat only theMapping
?I believe this could be done with this, right?
Versions:
Thanks a lot! Nicco