lvclark closed this issue 1 year ago
Yes, sadly collapse-range hasn't been kept up to date, and rather than fixing it we're refactoring the code to accommodate batch runs more easily.
In the meantime, you can simply repeat what the collapse-range wrapper does. I wrote the commands below without being able to test them, so you may have to make a few tweaks (or feel free to comment again here).
# partition the bed file into independent regions
sort -k1,1 -k2,2n --parallel=<cpus> input.bed > sorted.bed
bedPartition -parallel=<cpus> sorted.bed ranges.bed
bgzip sorted.bed
tabix -f --preset bed --zero-based sorted.bed.gz
mkdir rundir
# write one flair collapse command per range; underscores keep the output names unambiguous
while read -r chr start end; do
echo "flair collapse --range ${chr}:${start}-${end} -q sorted.bed.gz --threads <cpus> (...rest of your inputs...) --output rundir/${chr}_${start}_${end}.heart2.flair_collapsed"
done < ranges.bed > my.commands
# Run these on your cluster independently, then combine:
cat rundir/*isoforms.bed > $OUTDIR/heart2.flair_collapsed.isoforms.bed
cat rundir/*isoforms.fa > $OUTDIR/heart2.flair_collapsed.isoforms.fa
cat rundir/*isoforms.gtf > $OUTDIR/heart2.flair_collapsed.isoforms.gtf
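Before concatenating, it may be worth confirming that every range actually produced an output. Here is a minimal sketch, assuming each command writes exactly one *isoforms.bed file into rundir. The function name check_outputs is hypothetical, and I haven't verified whether flair writes a file for a range that yields no isoforms, so treat a mismatch as a prompt to look at the logs rather than as proof of failure.

```shell
# Hypothetical helper: compare the number of ranges with the number of
# per-range isoform BED files before combining them.
check_outputs() {
    ranges_file=$1   # e.g. ranges.bed
    out_dir=$2       # e.g. rundir

    # One non-empty line per range.
    expected=$(grep -c . "$ranges_file" || true)

    # Count the per-range outputs that exist.
    found=0
    for f in "$out_dir"/*isoforms.bed; do
        if [ -e "$f" ]; then
            found=$((found + 1))
        fi
    done

    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found of $expected ranges have output"
    else
        echo "MISSING: $found of $expected ranges have output"
        return 1
    fi
}
```

Usage would be something like: check_outputs ranges.bed rundir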
Ok, thanks for the reply! I am in the process of doing something similar based on the shell script that I found in the repo.
Copy and paste the exact command you tried to run
I was on a PBS job with 8 threads and 64 GB of memory.
How did you install Flair?
I had to build my own Apptainer container, since htslib and bedPartition were required for collapse-range to run but weren't on the container. I made a Docker image that I converted to Apptainer. Here is my Dockerfile.
What happened?
What else do we need to know?
Some of the chromosome names contain underscores, so based on previous experience with Flair I am on the lookout for that causing issues.
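A quick way to see which names are affected is to list the distinct chromosome names that contain an underscore. This is a minimal sketch, assuming a tab-delimited, uncompressed BED with the chromosome in column 1. The function name chroms_with_underscores is hypothetical.

```shell
# Hypothetical helper: print each distinct chromosome name (column 1)
# that contains an underscore, in order of first appearance.
# Streams the file once, so it stays cheap on a large BED.
chroms_with_underscores() {
    awk -F'\t' '$1 ~ /_/ && !seen[$1]++ { print $1 }' "$1"
}
```

Usage would be something like: chroms_with_underscores sorted.bed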
My corrected BED is 20 GB, hence the need to use collapse-range! This is PacBio MAS-seq data.