cellgeni / STARsolo

wrapper scripts for convenient STARsolo processing of 10X and other scRNA-seq
GNU General Public License v3.0

*OUTPUT FILE* error: could not create output file when running RSEM with single-cell RNA-seq data #4

Closed myliu221 closed 1 year ago

myliu221 commented 1 year ago

Hi,

I tried to run RSEM on single-cell RNA-seq data. The command I used was:

STAR --runThreadN 4 --genomeDir $REF --readFilesIn $R1 $R2 \
  --runDirPerm All_RWX --readFilesCommand zcat \
  --quantMode TranscriptomeSAM --outSAMtype BAM Unsorted \
  --soloBarcodeMate 1 --clip5pNbases 39 0 \
  --soloType CB_UMI_Simple --soloCBwhitelist 3M-february-2018.txt \
  --soloCBstart 1 --soloCBlen 16 --soloUMIstart 17 --soloUMIlen 12 \
  --soloStrand Forward --soloUMIdedup 1MM_CR \
  --soloCBmatchWLtype 1MM_multi_Nbase_pseudocounts \
  --soloUMIfiltering MultiGeneUMI_CR --soloCellFilter EmptyDrops_CR \
  --outFilterScoreMin 30 --soloFeatures Gene GeneFull Velocyto \
  --soloOutFileNames output/ features.tsv barcodes.tsv matrix.mtx \
  --soloMultiMappers EM --outReadsUnmapped Fastx

I got the same error whether I added "ulimit -n 10000" or reduced the thread number:

"SoloFeature_processRecords.cpp:64:processRecords: exiting because of OUTPUT FILE error: could not create output file ./output/GeneFull/Features.stats SOLUTION: check that the path exists and you have write permission for this file. Also check ulimit -n and increase it to allow more open files."

The STAR version I used is 2.7.10b.

Is it possible to integrate RSEM with STARsolo for single-cell RNA-seq data? I would appreciate any suggestions on parameter settings. Thanks.

Best, Mingyu

apredeus commented 1 year ago

Hi! Thank you for trying out our wrapper scripts. The error you're getting does not seem to be related to RSEM. Rather, it looks like you don't have the space or the permission to write files in the directory you're running in. I would not recommend running separate commands; instead, run the whole starsolo_10x_auto.sh script if you're trying to process a 10x experiment.
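A quick way to test the two suspects named above (permissions and free space) before rerunning STAR is sketched below. The function name check_outdir is hypothetical and not part of STAR or the wrapper scripts; it only relies on standard shell tools.

```shell
# Hypothetical helper, not part of STAR or the wrapper scripts.
# Reports whether files can be created under the given output directory.
check_outdir() {
    local dir="${1:-output}"
    # The directory (and its parents) must exist or be creatable.
    if ! mkdir -p "$dir" 2>/dev/null; then
        echo "cannot create $dir"
        return 1
    fi
    # Write and remove a small probe file to test write permission directly.
    local probe="$dir/.write_probe.$$"
    if ! ( : > "$probe" ) 2>/dev/null; then
        echo "cannot write to $dir"
        return 1
    fi
    rm -f "$probe"
    echo "writable: $dir"
    # Free space on the filesystem holding the directory, and the open-file limit.
    df -h "$dir"
    echo "open file limit: $(ulimit -n)"
}
```

For the command in this issue, `check_outdir output/GeneFull` run from the same working directory would show whether the path named in the error message can be written at all.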

What is your motivation for running RSEM on single-cell RNA-seq data? I've removed the RSEM part in the latest commit because it's really meant for bulk data. STARsolo can now do what RSEM would be good for, namely counting reads that map to multiple genes. The option --soloMultiMappers EM does exactly that.

myliu221 commented 1 year ago

Hi. Thanks for your response. I would like to use RSEM to quantify isoform expression in single-cell RNA-seq data. I found that there is a mode for quantifying splice junctions by setting --soloFeatures to SJ. Can isoform expression be inferred from splice-junction expression? Or are there other parameter settings for isoform quantification in STARsolo?

Thanks, Mingyu

apredeus commented 1 year ago

Hi Mingyu,

I don't think I've seen isoform quantification options in STARsolo. I think your best option would be to use salmon with the "full decoy" option; it should be almost as accurate as STARsolo, run very fast, and give isoform-level expression estimates.
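For reference, building a "full decoy" salmon index generally means using the whole genome as the decoy set. The sketch below shows one way to prepare the decoy list. The function name make_decoys and all file names are placeholders, and it assumes uncompressed FASTA input; the salmon command is left as a comment because the right settings depend on your annotation.

```shell
# Hypothetical helper: list the sequence names of a genome FASTA, one per line,
# which is the format of the decoy list passed to salmon.
make_decoys() {
    # Keep header lines, cut everything after the first space, drop the leading ">".
    grep '^>' "$1" | cut -d ' ' -f 1 | sed 's/^>//'
}

# Example usage (file names are placeholders):
#   make_decoys genome.fa > decoys.txt
#   cat transcripts.fa genome.fa > gentrome.fa   # transcripts first, genome (decoys) last
#   salmon index -t gentrome.fa -d decoys.txt -i salmon_index -p 4
```

The resulting index would then be used for quantification; which salmon mode and options suit a given single-cell protocol is best checked against the salmon documentation.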