PierreBSC / Viral-Track

MIT License

XStringSet object is too big to be unlisted #14

Closed linquynus closed 3 years ago

linquynus commented 3 years ago

Hi,

Thanks a lot for the useful tool. When I tried to run "Viral_Track_scanning.R" with COVID-19 data from your paper (GEO Datasets: GSM4339774), it reported the following error right after read alignment:

XStringSet object is too big to be unlisted (would result in an XString object of length 2^31 or more)

Could you please help me resolve this error? By the way, I used the 10X Genomics "737K-august-2016.txt" as the whitelist file. Is that appropriate?

Best, Quy

PierreBSC commented 3 years ago

Hi Quy,

Thanks a lot for your feedback. Could you post a screenshot of the log? It would greatly help me understand what is going on. Have you checked the other input files?

Best

Pierre

linquynus commented 3 years ago

Hi Pierre,

Sorry for the late response; I had some issues with the server. I reran one sample and got the same error, as shown in the log below.

Loading of the libraries.... ... done !
1 Fastq files are going to be processed!
Mapping combined.extracted.R2.fastq file
Oct 23 13:22:28 ..... started STAR run
Oct 23 13:22:28 ..... loading genome
Oct 23 13:30:59 ..... started 1st pass mapping
Oct 23 15:15:27 ..... finished 1st pass mapping
Oct 23 15:15:31 ..... inserting junctions into the genome indices
Oct 23 15:22:27 ..... started mapping
Oct 23 18:23:58 ..... finished mapping
Oct 23 18:24:02 ..... started sorting BAM
Oct 23 19:49:56 ..... finished successfully
Mapping of combined.extracted.R2.fastq done !
All fastq files have been mapped successfully
Starting the BAM file analysis
Indexing of the bam file for combined.extracted.R2 is done
Computing stat file for the bam file for combined.extracted.R2 is done
Checking the mapping quality of each virus...
Export of the viral SAM file done for combined.extracted.R2
Error in { : task 1 failed - "XStringSet object is too big to be unlisted (would result in an XString object of length 2^31 or more)"
Calls: %dopar% ->
Execution halted

linquynus commented 3 years ago

Hi,

I read your R code and found the cause. The reference genome I used has a "chr" prefix on the names of chromosomes 1-22, X, and Y. Your R code seems to expect a reference without the "chr" prefix.

I removed "chr" from the reference genome FASTA file, and now it works.
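
For anyone hitting the same error, here is a minimal sketch of how the fix above could be done in Python. It is not part of Viral-Track: the function names and the regular expression are my own, and it assumes an uncompressed FASTA file in which the "chr" prefix appears only in header lines (lines starting with ">"). It only touches chromosomes 1-22, X, and Y, as described above, and leaves other names (for example chrM or unplaced scaffolds) unchanged.

```python
import re
from typing import Iterable, Iterator, List

# Header line whose sequence name is "chr" followed by a one- or two-digit
# number, X, or Y (e.g. ">chr1", ">chr22", ">chrX"). The lookahead requires
# the name to end there, so ">chr1_random" and ">chrM" do not match.
CHR_HEADER = re.compile(r"^>chr([0-9]{1,2}|X|Y)(?=\s|$)")


def chr_prefixed_headers(lines: Iterable[str]) -> List[str]:
    """Return every FASTA header line whose name starts with 'chr'."""
    return [line.rstrip("\n") for line in lines if line.startswith(">chr")]


def strip_chr_prefix(lines: Iterable[str]) -> Iterator[str]:
    """Yield the FASTA lines with 'chr' removed from matching headers."""
    for line in lines:
        if line.startswith(">"):
            # Only header lines are rewritten; sequence lines pass through.
            yield CHR_HEADER.sub(r">\1", line)
        else:
            yield line


def rewrite_fasta(src_path: str, dst_path: str) -> None:
    """Stream src_path to dst_path, stripping the prefix line by line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(strip_chr_prefix(src))
```

To inspect a file first, something like `with open("genome.fa") as fh: print(chr_prefixed_headers(fh))` lists the affected headers (the file name is a placeholder). After `rewrite_fasta("genome.fa", "genome.nochr.fa")`, the STAR index has to be rebuilt from the new FASTA, and any annotation file used alongside it needs matching sequence names.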

zhouxinseeu commented 3 years ago

Hi @linquynus! I got the same error as you. You said that removing "chr" from the reference genome FASTA file works. How can I check the FASTA file to see where "chr" appears?