eudoraleer / scasa

SCASA: Single cell transcript quantification tool
GNU General Public License v3.0

Quantification Fails #2

Open abhideeplife opened 2 years ago

abhideeplife commented 2 years ago

Hi,

Thank you for the tool. I am using it for isoform quantification, and as a test I am running it on a single 10xV2 sample.

The command I am using:

scasa --fastq ERX3806131/4861STDY7462259_R1.fastq.gz,ERX3806131/4861STDY7462259_R2.fastq.gz --ref $refPath --nthreads 8 --out Scasa_out

The processing output I get:

##############################################################

SCASA V1.0.0

SINGLE CELL TRANSCRIPT QUANTIFICATION TOOL

Version Date: 2021-04-07

FOR ANY ISSUES, CONTACT: LU.PAN@KI.SE

https://github.com/eudoraleer/scasa/

##############################################################

mkdir: cannot create directory ‘Scasa_out/SCASA_My_Project_20220527085718/’: File exists

Preparing for alignment..
Indexing reference..
Directory Scasa_out/SCASA_My_Project_20220527085718/0PRESETS//REF_INDEX/ already exists. Writing into existing directory..
Version Info: ### PLEASE UPGRADE SALMON ###

A newer version of salmon with important bug fixes and improvements is available.

The newest version, available at https://github.com/COMBINE-lab/salmon/releases contains new features, improvements, and bug fixes; please upgrade at your earliest convenience.

Sign up for the salmon mailing list to hear about new versions, features and updates at: https://oceangenomics.com/subscribe

[2022-05-27 08:57:18.759] [jLog] [warning] The salmon index is being built without any decoy sequences. It is recommended that decoy sequence (either computed auxiliary decoy sequence or the genome of the organism) be provided during indexing. Further details can be found at https://salmon.readthedocs.io/en/latest/salmon.html#preparing-transcriptome-indices-mapping-based-mode.

[2022-05-27 08:57:18.759] [jLog] [info] building index
out : Scasa_out/SCASA_My_Project_20220527085718/0PRESETS//REF_INDEX/
[2022-05-27 08:57:18.759] [puff::index::jointLog] [info] Running fixFasta

[Step 1 of 4] : counting k-mers

[2022-05-27 08:57:26.609] [puff::index::jointLog] [warning] Removed 237 transcripts that were sequence duplicates of indexed transcripts.
[2022-05-27 08:57:26.609] [puff::index::jointLog] [warning] If you wish to retain duplicate transcripts, please use the --keepDuplicates flag
[2022-05-27 08:57:26.610] [puff::index::jointLog] [info] Replaced 5 non-ATCG nucleotides
[2022-05-27 08:57:26.610] [puff::index::jointLog] [info] Clipped poly-A tails from 12501 transcripts
wrote 70629 cleaned references
[2022-05-27 08:57:27.256] [puff::index::jointLog] [info] Filter size not provided; estimating from number of distinct k-mers
[2022-05-27 08:57:30.321] [puff::index::jointLog] [info] ntHll estimated 84081876 distinct k-mers, setting filter size to 2^31
Threads = 2
Vertex length = 31
Hash functions = 5
Filter size = 2147483648
Capacity = 2
Files:
Scasa_out/SCASA_My_Project_20220527085718/0PRESETS//REF_INDEX/ref_k31_fixed.fa

Round 0, 0:2147483648
Pass Filling Filtering
1 25 69
2 4 0
True junctions count = 266516
False junctions count = 404633
Hash table size = 671149
Candidate marks count = 4093104

Reallocating bifurcations time: 0
True marks count: 2954071
Edges construction time: 4

Distinct junctions = 266516

allowedIn: 12
Max Junction ID: 308126
seen.size():2465017 kmerInfo.size():308127
approximateContigTotalLength: 65012593
counters for complex kmers:
(prec>1 & succ>1)=25336 | (succ>1 & isStart)=59 | (prec>1 & isEnd)=73 | (isStart & isEnd)=10
contig count: 417773 element count: 96576272 complex nodes: 25478

# of ones in rank vector: 417772

[2022-05-27 08:59:28.244] [puff::index::jointLog] [info] Starting the Pufferfish indexing by reading the GFA binary file.
[2022-05-27 08:59:28.244] [puff::index::jointLog] [info] Setting the index/BinaryGfa directory Scasa_out/SCASA_My_Project_20220527085718/0PRESETS//REF_INDEX
size = 96576272

| Loading contigs | Time = 9.0059 ms

size = 96576272

| Loading contig boundaries | Time = 5.0433 ms

Number of ones: 417772
Number of ones per inventory item: 512
Inventory entries filled: 816
417772
[2022-05-27 08:59:28.456] [puff::index::jointLog] [info] Done wrapping the rank vector with a rank9sel structure.
[2022-05-27 08:59:28.460] [puff::index::jointLog] [info] contig count for validation: 417772
[2022-05-27 08:59:28.648] [puff::index::jointLog] [info] Total # of Contigs : 417772
[2022-05-27 08:59:28.648] [puff::index::jointLog] [info] Total # of numerical Contigs : 417772
[2022-05-27 08:59:28.676] [puff::index::jointLog] [info] Total # of contig vec entries: 3035777
[2022-05-27 08:59:28.676] [puff::index::jointLog] [info] bits per offset entry 22
[2022-05-27 08:59:28.787] [puff::index::jointLog] [info] Done constructing the contig vector. 417773
[2022-05-27 08:59:28.924] [puff::index::jointLog] [info] # segments = 417772
[2022-05-27 08:59:28.924] [puff::index::jointLog] [info] total length = 96576272
[2022-05-27 08:59:28.957] [puff::index::jointLog] [info] Reading the reference files ...
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] positional integer width = 27
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] seqSize = 96576272
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] rankSize = 96576272
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] edgeVecSize = 0
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] num keys = 84043112
for info, total work write each : 2.331 total work inram from level 3 : 4.322 total work raw : 25.000
[Building BooPHF] 100 % elapsed: 0 min 10 sec remaining: 0 min 0 sec
Bitarray 440364608 bits (100.00 %) (array + ranks )
final hash 0 bits (0.00 %) (nb in final hash 0)
[2022-05-27 08:59:39.507] [puff::index::jointLog] [info] mphf size = 52.4956 MB
[2022-05-27 08:59:39.580] [puff::index::jointLog] [info] chunk size = 48288136
[2022-05-27 08:59:39.580] [puff::index::jointLog] [info] chunk 0 = [0, 48288136)
[2022-05-27 08:59:39.580] [puff::index::jointLog] [info] chunk 1 = [48288136, 96576242)
[2022-05-27 08:59:52.325] [puff::index::jointLog] [info] finished populating pos vector
[2022-05-27 08:59:52.325] [puff::index::jointLog] [info] writing index components
[2022-05-27 08:59:52.728] [puff::index::jointLog] [info] finished writing dense pufferfish index
[2022-05-27 08:59:52.766] [jLog] [info] done building index
Finnished indexing reference..
Begins pseudo-alignment..
nohup: redirecting stderr to stdout

The error I get as soon as the quantification step starts:

Congratulations! Pseudo-alignment has completed in 1590 seconds!
Scasa quantification has started..
Begin Scasa quantification for sample 4861STDY7462259..
Loading required package: iterators
Loading required package: parallel
Error in { : task 1 failed - "NA/NaN argument"
Calls: %dopar% ->
Execution halted
Loading required package: iterators
Loading required package: parallel
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
Calls: load -> readChar
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
cannot open compressed file '/home/jupyter/Scasa_out/SCASA_My_Project_20220527085718/2QUANT/4861STDY7462259_quant/Sample_eqClass.RData', probable reason 'No such file or directory'
Execution halted
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
Calls: load -> readChar
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
cannot open compressed file 'Scasa_out/SCASA_My_Project_20220527085718/2QUANT//4861STDY7462259_quant//scasa_isoform_expression.RData', probable reason 'No such file or directory'
Execution halted
Congratulations! Scasa single cell RNA-Seq transcript quantification has completed in 30 seconds!
All done!

I have installed all the required R packages, so I am not sure why the quantification is not being performed.

Could you please help?

Thank you

eudoraleer commented 2 years ago

Error in { : task 1 failed - "NA/NaN argument"

Hihi,

It looks like something is wrong with either the R library or the data itself. Do you have the post-alignment file?
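A quick way to see what the run left behind is to check for the two files named in the error messages above. This is only a minimal sketch: the function name `check_scasa_outputs` is made up for illustration, the directory is copied from the log, and it does not identify the post-alignment file itself, whose name is not given in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of scasa): report which of the two .RData
# files named in the R error messages exist and are non-empty.
check_scasa_outputs() {
    quant_dir="$1"
    missing=0
    for f in Sample_eqClass.RData scasa_isoform_expression.RData; do
        if [ -s "$quant_dir/$f" ]; then
            echo "found:   $quant_dir/$f"
        else
            echo "missing: $quant_dir/$f"
            missing=$((missing + 1))
        fi
    done
    # The exit status is the number of missing files (0 means both are present).
    return "$missing"
}

# Directory taken from the error output above; adjust to your own run.
check_scasa_outputs "Scasa_out/SCASA_My_Project_20220527085718/2QUANT/4861STDY7462259_quant" || true
```

If both files are reported missing, the later `readChar` errors are most likely just a consequence of the first `%dopar%` failure rather than separate problems.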

Best, Lu

ThepeachYolado commented 2 years ago

Hi, I had the same problem. Could you please help? Thanks!

nghiavtr commented 2 years ago

@yangwh1998

Please consider running scasa with Docker to avoid problems with installing the dependencies.

Instructions for running scasa with Docker are provided here: https://github.com/eudoraleer/scasa/blob/main/README.md#using-docker-to-run-scasa

Best, Nghia