GoekeLab / bambu

Reference-guided transcript discovery and quantification for long read RNA-Seq data
GNU General Public License v3.0

" double free or corruption (!prev) " when Start isoform quantification #412

Closed mechealeth closed 10 months ago

mechealeth commented 10 months ago

Hi, I would like to use bambu for my Iso-Seq analysis, but I have encountered a "double free or corruption (!prev)" error. The error output is as follows:

    Transcript names will be made unique
    --- Start generating read class files ---
    Detected 2 warnings across the samples during read class construction. Access warnings with metadata(bambuOutput)$warnings
    --- Start extending annotations ---
    The current classes, please consider less strigent critria!
    --- Start isoform quantification ---
    Error in `/xx/lib/R/bin/exec/R': double free or corruption (!prev): 0x0000555b1bf2b330
    ======= Backtrace: ========

My run code is as follows:

    test.bam <- "/xx/tumour_PacBio.hs1.aligned.bam"
    fa.file <- "/xx/hs1.fa"
    bambuAnnotations <- prepareAnnotations("/xx/hs1.ncbiRefSeq.gtf")
    se <- bambu(reads = test.bam, annotations = bambuAnnotations, genome = fa.file, NDR = 0.1)

sessionInfo:

     [1] bambu_3.4.0                 BSgenome_1.70.1
     [3] rtracklayer_1.62.0          BiocIO_1.12.0
     [5] Biostrings_2.70.1           XVector_0.42.0
     [7] SummarizedExperiment_1.32.0 Biobase_2.62.0
     [9] GenomicRanges_1.54.1        GenomeInfoDb_1.38.1
    [11] IRanges_2.36.0              S4Vectors_0.40.2
    [13] BiocGenerics_0.48.1         MatrixGenerics_1.14.0
    [15] matrixStats_1.0.0

many thanks

andredsim commented 10 months ago

Hi,

Thanks for sharing your error message and session info. It looks like one of the warnings got truncated and it should say: "The current filtering criteria filters out all new read classes, please consider less stringent criteria!" I don't think this is the direct cause of the issue but might be a symptom of the cause.

A few things to confirm and test that will help me resolve this issue:

  1. Can you please share head(bambuAnnotations)? I notice this isn't a standard GTF file name. May I ask where you got this file, and whether it was converted from a GFF3 or produced through other means?
  2. Do the chromosome/scaffold names in the GTF file match those in the FASTA file you are using? (For example, chr1 vs 1 will cause issues.)
  3. Is the genome FASTA file you are using the same one that you aligned the reads to?
  4. Is this a standard bulk run, or was there any selection for only a few genes? Has the data been subset in any way?
  5. Does bambu run successfully if you set NDR = 1 and verbose = TRUE? Note that even if it is successful, because of the above-mentioned warning, I believe there may be other issues. Please also post the warnings after running it with verbose on.
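
Point 2 can be checked directly by comparing the two sets of sequence names. The following is only a minimal sketch in base R, not part of bambu: `compare_seqnames` is a hypothetical helper, and the commented lines show one way the name vectors might be obtained, assuming GenomeInfoDb and Biostrings are installed and the FASTA header lines contain only the sequence name.

```r
# Hypothetical helper (not a bambu function): report sequence names that
# appear in only one of the two inputs, plus the names they share.
compare_seqnames <- function(anno_names, fasta_names) {
  list(
    only_in_annotation = setdiff(anno_names, fasta_names),
    only_in_fasta      = setdiff(fasta_names, anno_names),
    shared             = intersect(anno_names, fasta_names)
  )
}

# With real data the inputs could be obtained roughly like this (assumption:
# each FASTA header line is just the sequence name, e.g. "chr1"):
#   anno_names  <- GenomeInfoDb::seqlevels(bambuAnnotations)
#   fasta_names <- Biostrings::fasta.index(fa.file)$desc

# Toy vectors illustrating the "chr1" vs "1" mismatch mentioned above
res <- compare_seqnames(c("chr1", "chr2", "chrM"), c("1", "2", "chrM"))
print(res$only_in_annotation)
print(res$only_in_fasta)
print(res$shared)
```

If `only_in_annotation` is non-empty, the annotation refers to sequences that the genome file does not provide under the same name.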

Let me know and I will do my best to help.

Kind Regards, Andre Sim

mechealeth commented 10 months ago

Hi Andre, Thanks for your reply and troubleshooting.

  1. I got hs1.ncbiRefSeq.gtf from UCSC (https://hgdownload.soe.ucsc.edu/goldenPath/hs1/bigZips/genes/). The output of head(bambuAnnotations) is shown below:

     [Screenshot 2024-01-24 at 8 42 05 am]

  2. The chromosome/scaffold names in the GTF file match the FASTA file I am using, but there is a warning saying "not all chromosomes present in reference annotations" (as in the figure for point 5), shown below:

     [Screenshot 2024-01-24 at 8 59 38 am]

  3. Yes, I aligned the reads to the same genome FASTA (hs1.fa).
  4. It's a standard bulk run and I didn't subset it.
  5. It still gives the same error message, "double free or corruption (!prev)", when running with NDR = 1 and verbose = TRUE.

Many thanks,
mc

andredsim commented 10 months ago

Hi MC,

Thanks for this. From what you have posted, I can rule out the usual suspects, as the run looks completely normal up until the isoform quantification step. It seems to be a C++-level issue, which makes the cause harder for us to determine. Are you able to share the data with us so that we can try to replicate this error on our end? We would need the BAM file, the FASTA and the GTF you are using. If you are unable to share the data, let me know and we will do our best to continue troubleshooting.

As this appears to be a run with multiple samples, it could be that only one BAM file is triggering this error. To test this, you can run bambu as normal but set quant = FALSE and rcOutDir = "/output/path/". Then run each sample through quantification separately using the RDS files produced by the first run. Be sure to change the rcOutDir path to something suitable for you, and the reads argument to the file names produced.

    # Step 1: transcript discovery only; read class files are written to rcOutDir
    extendedAnno <- bambu(reads = c("sample1.bam", "sample2.bam"), # ...and any further BAM files
                          annotations = bambuAnnotations, genome = fa.file,
                          NDR = 0.1, quant = FALSE, rcOutDir = "/output/path/")

    # Step 2: quantify one sample at a time from its RDS file
    se <- bambu(reads = "/output/path/XXXX.rds", annotations = extendedAnno,
                genome = fa.file, discovery = FALSE)

See this section on how to use the output rds files if you are unsure which output file corresponds to which input. https://github.com/GoekeLab/bambu?tab=readme-ov-file#Storing-and-using-preprocessed-files-rcFiles

Kind Regards, Andre Sim

andredsim commented 10 months ago

Also, have you tried running this outside a conda environment? Some further reading on this error suggests it might be linked to conda and the OpenCV version on the system.

mechealeth commented 10 months ago

Hi Andre,

It still gives the same error when running bambu outside a conda environment. The error is as follows:

    --- Start isoform quantification ---
    Error in `/sw/el7/R/4.3.1/lib64/R/bin/exec/R': double free or corruption (!prev): 0x000000008b35dc20
    ======= Backtrace: =========
    /lib64/libc.so.6(+0x81329)[0x7f4250e85329]
    /sw/el7/R/4.3.1/lib64/R/library/data.table/libs/data_table.so(+0x2b833)[0x7f4238908833]
    /sw/el7/R/4.3.1/lib64/R/library/data.table/libs/data_table.so(forder+0x1a51)[0x7f423890b9cc]
    /sw/el7/R/4.3.1/lib64/R/bin/exec/R[0x4986da]
    /sw/el7/R/4.3.1/lib64/R/bin/exec/R[0x500dcd]
    /sw/el7/R/4.3.1/lib64/R/bin/exec/R(Rf_eval+0x1e4)[0x4df380]

I will send you my data as an RDS file later.

Many thanks,
mc


andredsim commented 10 months ago

Thanks for trying that. If you have the bandwidth to try more while you generate the RDS files, could you try running the following line before bambu()?

    setDTthreads(4) # or try 1 if this also doesn't work

We think it might be an issue with data.table using more CPUs than expected: https://github.com/Rdatatable/data.table/issues/5186
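
For context, `setDTthreads()` and `getDTthreads()` are exported by the data.table package, so it needs to be attached (or the calls prefixed with `data.table::`). A minimal sketch of the suggestion, assuming data.table is installed; the variable names `threads_before` and `threads_now` are illustrative only, and the commented bambu call reuses the objects from the original report:

```r
library(data.table)

# Record how many threads data.table is currently allowed to use
threads_before <- getDTthreads()

# Cap data.table at 4 threads for this R session (try 1 if 4 still crashes)
setDTthreads(4)
threads_now <- getDTthreads()
message("data.table threads: ", threads_before, " -> ", threads_now)

# Then run bambu as before, in the same session:
# se <- bambu(reads = test.bam, annotations = bambuAnnotations,
#             genome = fa.file, NDR = 0.1)
```

The cap only applies to the current R session, so it has to be set again in every new session or script before calling bambu().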

mechealeth commented 10 months ago

Hi Andre,

setDTthreads(4) works! Thanks a lot.

Many thanks,
mc