10XGenomics / cellranger

10x Genomics Single Cell Analysis
https://www.10xgenomics.com/support/software/cell-ranger

cellranger (v5.0) mkref --nthreads=16 is not passed onto STAR command #107

Closed riyuebao closed 3 years ago

riyuebao commented 3 years ago

Hello - I was checking the STAR log files and realized that the number of threads specified for cellranger mkref is not passed on to the actual STAR run. Would you please take a look? The commands and logs are shown below.

Thanks!!

Best, Riyue

cellranger command

cellranger mkref --genome=${genome} --fasta=${genome}.fa --genes=$annotation.filtered.gtf --nthreads=16 --memgb=32

cellranger mkref log file shows --nthreads=16

[ Mon Dec 28 15:09:33 EST 2020 ] Make ref index
['/ihome/crc/install/cellranger/cellranger-5.0.0/bin/rna/mkref', '--genome=GRCh38.primary_assembly.genome', '--fasta=GRCh38.primary_assembly.genome.fa', '--genes=gencode.v35.primary_assembly.annotation.maskPAR.filtered.gtf', '--nthreads=16', '--memgb=32']
Dec 28 15:15:38 ..... started STAR run
Dec 28 15:15:38 ... starting to generate Genome files
Dec 28 15:17:32 ... starting to sort Suffix Array. This may take a long time...
Dec 28 15:17:42 ... sorting Suffix Array chunks and saving them to disk...

actual log file from STAR Log.out still shows --runThreadN 1

STAR version=                 2.7.2a
STAR compilation time,server,dir=__REDACTED__
##### Command Line:
STAR --runMode genomeGenerate --genomeDir GRCh38.primary_Gencode35_maskPAR/GRCh38.primary_assembly.genome/star --runThreadN 1 --genomeFastaFiles GRCh38.primary_Gencode35_maskPAR/GRCh38.primary_assembly.genome/fasta/genome.fa --sjdbGTFfile GRCh38.primary_Gencode35_maskPAR/GRCh38.primary_assembly.genome/genes/genes.gtf --limitGenomeGenerateRAM 34359738368 --genomeSAsparseD 2 --genomeSAindexNbases 14 --genomeChrBinNbits 18
##### Initial USER parameters from Command Line:
###### All USER parameters from Command Line:
runMode                       genomeGenerate     ~RE-DEFINED
genomeDir                     GRCh38.primary_Gencode35_maskPAR/GRCh38.primary_assembly.genome/star     ~RE-DEFINED
runThreadN                    1     ~RE-DEFINED
genomeFastaFiles              GRCh38.primary_Gencode35_maskPAR/GRCh38.primary_assembly.genome/fasta/genome.fa        ~RE-DEFINED
sjdbGTFfile                   GRCh38.primary_Gencode35_maskPAR/GRCh38.primary_assembly.genome/genes/genes.gtf     ~RE-DEFINED
limitGenomeGenerateRAM        34359738368     ~RE-DEFINED
genomeSAsparseD               2     ~RE-DEFINED
genomeSAindexNbases           14     ~RE-DEFINED
genomeChrBinNbits             18     ~RE-DEFINED
##### Finished reading parameters from all sources
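One way to confirm what STAR actually received is to read the command line back out of Log.out. Below is a minimal Python sketch, assuming the log keeps the layout shown above (a `##### Command Line:` marker followed by the full STAR invocation on the next line). The function name `star_thread_count` is made up for illustration and is not part of cellranger or STAR.

```python
import shlex


def star_thread_count(log_text):
    """Return the --runThreadN value from the STAR command line recorded
    in Log.out-style text, or None if it cannot be found."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        # Assumption: the invocation sits on the line right after this marker.
        if line.startswith("##### Command Line:") and i + 1 < len(lines):
            tokens = shlex.split(lines[i + 1])
            if "--runThreadN" in tokens:
                idx = tokens.index("--runThreadN")
                if idx + 1 < len(tokens):
                    return int(tokens[idx + 1])
    return None


# Usage (the path is a placeholder for your own reference directory):
# with open("path/to/star/Log.out") as handle:
#     print(star_thread_count(handle.read()))
```

For the log above this would report 1 rather than the 16 requested.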
evolvedmicrobe commented 3 years ago

Hi @riyuebao, this is unfortunately intentional behavior at the moment. STAR with multiple threads utilizes OpenMP for multithreading, and this library can create issues when run on older hardware that does not support AVX2 instructions. To avoid that problem, we've temporarily disabled that option (you can see the message about this in the help text), until we can find time to solve it (hopefully) in the next software release.

In the interim, the only workaround, if you need multiple threads, is to download the last version of cellranger that supported threading on all supported platforms.
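To see whether a given node is among the older machines without AVX2, the CPU flags can be inspected. Below is a rough Python sketch, assuming a Linux x86 host where /proc/cpuinfo contains a `flags` line per core. `has_avx2` is an illustrative name, not something cellranger provides.

```python
def has_avx2(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only the 'flags' field is checked, so 'avx2' appearing in some
        # other field (e.g. a model name) does not count.
        if sep and key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


# Usage on a Linux host:
# with open("/proc/cpuinfo") as handle:
#     print(has_avx2(handle.read()))
```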

riyuebao commented 3 years ago

Hello - thanks for getting back to me so quickly!

I was using cellranger 5.0, which I thought was the most recent version and should have multithreading. If not, would you please point me to the last version that does?

Thanks!!

Best, Riyue


evolvedmicrobe commented 3 years ago

5.0 is best to use except for this one issue, for which the old version available here:

https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/4.0/

provides the workaround.

riyuebao commented 3 years ago

Great thank you!! Will try :)

Best, Riyue


cgalicia1014 commented 3 years ago

Hi, I am having the same issue in version 6.0. I'm guessing the issue was not solved with the new release?

plijnzaad commented 3 years ago

Dear @evolvedmicrobe,

this is very bad practice! Using the --nthreads option should simply crash if it doesn't work!! I have waited hours for jobs to start (the queueing system has more difficulty finding a node that has e.g. 16 cores available), then paid for 16 cores (while only 1 was being used), then run out of requested run time because I only had one core, wasting everyone's time and my money. Why is this not solved, if all cellranger mkref seems to do is call STAR --runMode genomeGenerate --runThreadN 1 _other_args_ ?

Is there any post-processing done by cellranger mkref after the call to STAR, the full command line of which is in the Log.out ?

Disgruntled,

Philip

plijnzaad commented 3 years ago

PS: this is cellranger 5.0.1

jyang635 commented 2 years ago

Thank you all for the input here. I was using cellranger 6.0.2 to build a custom reference genome and spent a lot of time figuring out why the pipeline hung at this step. @evolvedmicrobe, I think it's important to mention this on the cellranger website (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/advanced/references) even though 10X already included the message in the help text.

evolvedmicrobe commented 2 years ago

Hi @jyang635 and @plijnzaad,

This is indeed an unfortunate situation and one I can assure you we were looking to improve. @plijnzaad, in answer to your question, the problem is that simply calling STAR with multiple threads in our code leads to a SEGFAULT on unsupported hardware, so we disabled this option, as we felt it was better for the code path to work slowly than to not work at all and/or exit with a very confusing error message.
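The trade-off discussed here (run slowly versus fail loudly) can be sketched as a small thread-selection helper. This is a hypothetical illustration in Python, not Cell Ranger's actual code: per this thread, the affected releases passed `--runThreadN 1` regardless of hardware. The name `resolve_threads`, its `strict` parameter, and the idea of keying the decision on an AVX2 check are all assumptions made for the example.

```python
import warnings


def resolve_threads(requested, avx2_supported, strict=False):
    """Pick the thread count to hand to STAR (hypothetical helper).

    Returns the requested count when the hardware is assumed safe,
    otherwise falls back to 1 with a warning, or raises if strict=True.
    """
    if requested < 1:
        raise ValueError("thread count must be at least 1")
    if avx2_supported or requested == 1:
        return requested
    message = (
        f"{requested} threads requested, but this CPU lacks AVX2 and "
        "multithreaded STAR may crash"
    )
    if strict:
        # Fail-fast behavior: stop before queue time and core-hours are spent.
        raise RuntimeError(message)
    # Slow-but-working behavior, made visible instead of silent.
    warnings.warn(message + "; falling back to 1 thread")
    return 1
```

With `strict=True` a job requesting 16 threads on unsupported hardware would stop immediately instead of running on a single core.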

A new version of Cell Ranger will be released in the next few weeks that re-enables the nthreads option as well as having several other more meaningful improvements. We expect this will resolve this issue and avoid the need for further documentation around the previously disabled (and now re-enabled) option.

Warm wishes, Nigel

plijnzaad commented 2 years ago

Has this issue been resolved? If so, can you give the specifics? If not, can you please not mark it closed? Thanks, Philip

evolvedmicrobe commented 2 years ago

@plijnzaad the latest version of Cell Ranger passes nthreads on to STAR and does not show this issue. There are no plans to backport this fix to previous versions.