Closed: daria-dc closed this issue 3 years ago
Hi Daria,
For 1., would you mind sharing the file you pass as the 6th argument (counting from 1) to singularity/run_recount_unify.sh, i.e. SAMPLE_ID_MANIFEST_HOST?
For 2., the sort in the rejoin_exons rule in the Unifier just uses whatever $TMPDIR is defined as by Singularity when it runs the container; it's not something set by the Unifier.
So my suspicion is that something unexpected is going on with how the /scratch filesystem is being mapped into the running container by Singularity in that environment (or there's a permissions problem). If the /scratch filesystem (assuming the host path, not the container path) isn't being mapped correctly into Singularity, you'll get something like that error.
In general, which filesystems are mapped (and how) into Singularity containers is system/cluster specific.
However, one nice thing about Singularity (as you may already know) is that it's pretty easy to export environment variables into the container from the calling script (here, singularity/run_recount_unify.sh).
So one thing I might try is checking the value of $TMPDIR in your calling environment and also changing it, in singularity/run_recount_unify.sh, to a filesystem you know will be accessible to the container (e.g. /tmp).
Further, you could try manually shelling into the Unifier container in the environment where you'd normally run it and checking the following:
singularity shell recount-unify-1.0.4.simg
echo $TMPDIR
Maybe run a test sort as well, using various filesystems, which can be mounted via the -B HOST_FILESYSTEM_PATH:CONTAINER_FILESYSTEM_PATH Singularity argument.
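A minimal sketch of the checks suggested above, for anyone debugging the same error. The helper name check_tmpdir is hypothetical (it is not part of recount-unify), and the commented Singularity invocation is an assumption about a typical setup: it uses the SINGULARITYENV_ prefix to pass a variable into the container and -B to bind a host path, and the paths will differ per cluster.

```shell
#!/usr/bin/env bash
# Sketch only: check_tmpdir is a hypothetical helper, not part of recount-unify.
# It reports whether a directory exists and a file can actually be created in it,
# which is what sort needs when it spills temporary files to $TMPDIR.
check_tmpdir() {
    local dir="$1" probe
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Try to create (and then remove) a probe file inside the directory.
    if probe=$(mktemp "$dir/probe.XXXXXX" 2>/dev/null); then
        rm -f "$probe"
        echo "ok: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
}

# On the host: show what the calling environment would pass along.
echo "host TMPDIR=${TMPDIR:-<unset>}"
check_tmpdir "${TMPDIR:-/tmp}" || true

# Inside the container (assumed invocation; adjust image and paths as needed):
#   SINGULARITYENV_TMPDIR=/tmp singularity exec -B /scratch:/scratch \
#       recount-unify-1.0.4.simg sh -c 'echo "$TMPDIR"'
```

Running the same check from inside the container on each candidate path (/tmp, /scratch, and so on) should show which filesystems are actually mapped and writable there.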
Hope that helps in the debug process.
Hi Christopher,
thanks for your suggestions.
I managed to resolve 2. with some help from our cluster admin, by changing the configuration of Singularity.
For 1. my metadata file looks like this:
study_id sample_id
BS1 R2809
BS1 R2810
BS1 R2816
BS1 R2825
Best regards,
Daria
Not sure about the 1st problem, but I'd suggest making both the study and sample IDs longer, maybe by adding 3 more digits to each, just in case there's an issue with the ID generation that the Unifier does (I typically have much longer IDs in my runs).
Making study and sample IDs longer resolved the issue!
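One way to apply that suggestion, as a rough sketch only: the helper lengthen_ids below is hypothetical, and it assumes the manifest is tab-separated with a header row and two columns (study_id, sample_id), as in the excerpt above. Whether the longer IDs also have to match names used elsewhere in the pipeline output is not covered in this thread, so check that before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: lengthen_ids is a hypothetical helper, not part of recount-unify.
# Usage: lengthen_ids MANIFEST [SUFFIX]
# Appends SUFFIX (default "000") to the study and sample IDs on every
# non-header row of a two-column, tab-separated manifest.
lengthen_ids() {
    awk -F'\t' -v OFS='\t' -v suffix="${2:-000}" '
        NR == 1 { print; next }          # keep the header row unchanged
        { print $1 suffix, $2 suffix }   # lengthen study_id and sample_id
    ' "$1"
}

# Example (file names are placeholders):
#   lengthen_ids samples.tsv > samples.longer_ids.tsv
```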
Hi @ChristopherWilks,
I am running your pipeline on some data in order to later integrate them with the data in recount3. In the Unifier step, however, I am running into two issues:
1) Every second line in the file ids.tsv is:
The lines in between are just fine though, consisting of _study_id run railid. If I leave the file as it is, the rule _rejoin_genesfinal fails, because there is only 1 field instead of 3 in the input file. If I remove the ERROR lines from the ids.tsv file before this rule gets executed, it finishes successfully. Do you have an idea what these ERROR lines mean and how to get rid of them?
2) The rule rejoin_exons fails with the following statement:
But the directory is actually there. Do you know if this might be a Singularity issue (I am using version 3.6.0), or what else might be causing this?
Thanks for your help!
Daria
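For reference, the interim workaround described in 1) (dropping the malformed lines from ids.tsv before the next rule runs) could be sketched as below. The helper name filter_ids is hypothetical, and the sketch assumes ids.tsv is tab-separated with exactly 3 fields on every valid line. It only treats the symptom; in this thread the underlying problem went away once the study and sample IDs were made longer.

```shell
#!/usr/bin/env bash
# Sketch only: filter_ids is a hypothetical helper, not part of recount-unify.
# Usage: filter_ids IDS_FILE
# Prints only the lines that have exactly 3 tab-separated fields,
# dropping single-field lines such as the ERROR lines described above.
filter_ids() {
    awk -F'\t' 'NF == 3' "$1"
}

# Example (writes to a new file rather than editing in place):
#   filter_ids ids.tsv > ids.filtered.tsv
```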