Pathogen-Genomics-Cymru / lodestone

Mycobacterial pipeline
GNU Affero General Public License v3.0

TB-profiler new issue #62

Closed MarcNiebel closed 1 month ago

MarcNiebel commented 3 months ago

Unfortunately, a new error has come up. It seems similar to the previous one with tb-profiler (error log: tbprofiler.errlog.txt). Again, staging of files seems to be the issue (SRR26331605 versus SRR26331609) when looking at the directory and subdirectory hierarchy. I am not sure whether the error actually originates further up the pipeline.

d2
└── b221477a01b061477e4a4d7e323198
    ├── SRR26331605_report.json -> /home/mniebel/tb_pipeline/lodestone/work/66/1e4361b50d3eb5c6ed4398345eed7c/SRR26331605_report.json
    ├── SRR26331609_allelic_depth.minos.vcf.gz
    ├── SRR26331609_allelic_depth.minos.vcf.gz.tbi
    ├── bam
    ├── results
    ├── tbprofiler.errlog.txt
    ├── tmp
    └── vcf

The dataset I am currently running through Lodestone is the 79 samples from the MAGMA paper https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011648 (BioProject PRJNA1026351). I have been able to isolate the problem to SRR26331605 and SRR26331609.

The command I am using is:

nextflow -bg run main.nf -profile singularity --input_dir MAGMA_dataset_2a_issue_v4/ --pattern '*_{1,2}.fastq.gz' -with-report
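To see whether the same mixed-sample staging shows up in other task directories, a small scan of the Nextflow work directory can help. The following is only a sketch, not part of Lodestone: the function name `mixed_sample_dirs` is hypothetical, and it assumes sample IDs are SRA run accessions (SRR/ERR/DRR followed by digits) at the start of each file name, as in the listing above.

```python
import os
import re
from collections import defaultdict

# Assumption: file names start with an SRA run accession such as SRR26331605.
ACCESSION = re.compile(r"^([SED]RR\d+)")


def mixed_sample_dirs(work_dir):
    """Return {directory: sorted accessions} for directories whose direct
    entries (files, symlinks or subdirectories) name more than one accession."""
    seen = defaultdict(set)
    # os.walk does not follow symlinked directories by default, so staged
    # inputs that are symlinks are listed but not traversed.
    for root, dirnames, filenames in os.walk(work_dir):
        for name in dirnames + filenames:
            match = ACCESSION.match(name)
            if match:
                seen[root].add(match.group(1))
    return {d: sorted(ids) for d, ids in seen.items() if len(ids) > 1}


if __name__ == "__main__":
    # "work" is the default Nextflow work directory name; adjust if needed.
    for directory, accessions in mixed_sample_dirs("work").items():
        print(directory, ", ".join(accessions))
```

A directory being reported here is not proof of a bug, since some processes may legitimately stage several samples together, but it narrows down where to look.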

WhalleyT commented 3 months ago

Hi again Marc, I'll download it and have a look, will report back

WhalleyT commented 2 months ago

I think you were right: it's still a permissions issue. From what I can gather, the --temp argument in TB-profiler only creates a tmp directory for the outputs/intermediate files generated by the tool itself; it doesn't percolate down to the calls to samtools, bcftools et al.

It appears to work for me if I change $TMPDIR or $SINGULARITY_TMPDIR to somewhere that is writeable. Is that the case for you or is it something else?

If that is the case, I will see if there's a more "proper" way of doing this in tb-profiler and/or get in touch with the authors and try to sort it out.
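A quick way to test the writability theory before launching the pipeline is to check both variables on the host. This is a minimal sketch with a hypothetical helper name (`check_tmp_dirs`); it only inspects the host environment and says nothing about how the paths are bound inside the container.

```python
import os
import tempfile


def check_tmp_dirs(names=("TMPDIR", "SINGULARITY_TMPDIR"), env=None):
    """Map each variable name to 'unset', 'missing', 'not writable' or 'ok'."""
    env = os.environ if env is None else env
    status = {}
    for name in names:
        path = env.get(name)
        if not path:
            status[name] = "unset"
        elif not os.path.isdir(path):
            status[name] = "missing"
        else:
            try:
                # Create (and automatically delete) a real file rather than
                # relying on permission bits alone.
                with tempfile.NamedTemporaryFile(dir=path):
                    pass
                status[name] = "ok"
            except OSError:
                status[name] = "not writable"
    return status


if __name__ == "__main__":
    for name, state in check_tmp_dirs().items():
        print(f"{name}: {state}")
```

If either variable comes back as anything other than "ok", pointing it at a writable directory before running nextflow is the workaround described above.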

MarcNiebel commented 2 months ago

I have not explored that, to be honest. What I am trying to get my head around is why it happens with specific samples only. It also seems to be something seen by others testing Lodestone on this dataset using --singularity. Can I clarify: does this not occur with docker? I think asking Jody Phelan for some help would be a good shout.

WhalleyT commented 2 months ago

Agreed, I can't see an obvious reason, because there's no branching logic in that call to bcftools et al.; it's all the same set of pipes.

I've had no problems with -profile docker thus far. I'll try the MAGMA data on CLIMB (where I've been running in docker) and see whether that holds, but from memory it's always been okay.