STAR has two modes of output: write SAM (default) or write BAM files.
While STAR with SAM output runs very smoothly, even with multiple cores
and distributed over all machines (I tested with 25 instances), STAR with
sorted BAM output seems to be a BeeGFS killer.
When run with "--outSAMtype BAM SortedByCoordinate", STAR puts enormous
stress on the BeeGFS storage servers, pushing them beyond the maximum
number of writes they can handle per second. This leads to heavy IO wait
and a nearly unusable console, since the BeeGFS servers also host the
home directories (and the bash completion).
For now, my advice is: write out SAM and after the run convert to BAM.
The samtools conversion does not seem to pose any problems to the cluster.
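A rough sketch of what that two-step workflow could look like is below. This is not taken from the actual scripts: the function name, the DRY_RUN switch, the thread count, and the choice to name the result like STAR's own sorted output are all assumptions for illustration.

```shell
# Sketch only: align with plain SAM output, then sort/convert with samtools.
# Function names, DRY_RUN, and the output naming are illustrative assumptions.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: star_sam_then_bam GENOME_DIR READS_FASTQ OUT_PREFIX [THREADS]
star_sam_then_bam() {
    local genome_dir="$1"
    local reads="$2"
    local prefix="$3"
    local threads="${4:-4}"
    # STAR writes its SAM output to <prefix>Aligned.out.sam
    local sam="${prefix}Aligned.out.sam"
    # Assumption: reuse the name STAR would give a coordinate-sorted BAM
    local bam="${prefix}Aligned.sortedByCoord.out.bam"

    # Each step runs only if the previous one succeeded.
    run STAR --runThreadN "$threads" --genomeDir "$genome_dir" \
        --readFilesIn "$reads" --outFileNamePrefix "$prefix" \
        --outSAMtype SAM &&
    run samtools sort -@ "$threads" -o "$bam" "$sam" &&
    run samtools index "$bam" &&
    run rm -f "$sam"    # drop the intermediate SAM once the BAM exists
}
```

Calling it with DRY_RUN=1 only prints the four commands, which is handy for checking the resulting filenames before touching the cluster. Removing the intermediate SAM is optional; it is included here only because uncompressed SAM files can be large.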
In principle, this should just entail an extra step after the call to STAR in the estimate-XXX-abundance scripts, as well as updates to the filenames.
This is exactly the same as Issue dieterich-lab/b-tea#41.
From @tjakobi