Open NikoLichi opened 1 week ago
Hello there,
I made some progress with a different approach but bambu is still failing in the same issue.
I divided the data in 3 batches to obtain the extended annotations in the .rds files. I set it up like below for each of the 3 batches:
extendedAnnotations = bambu(reads = BAMs_one_per_Line, annotations = bambuAnnotations, genome = fa.file, NDR = 0.1, quant = FALSE, lowMemory = TRUE, ncore = 14, rcOutDir = "MY_DIR")
As a small note, I tested beforehand with the default NDR, and each batch had an NDR > 0.6; since I am working with human samples, I decided to fix NDR = 0.1.
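For reference, the batching itself can be sketched roughly like this (the directory paths and the `split` grouping are illustrative placeholders, not my actual setup; `bambuAnnotations` and `fa.file` are prepared as in the call above):

```r
library(bambu)

# Hypothetical location of the 480 BAM files.
all_bams <- list.files("bam_dir", pattern = "\\.bam$", full.names = TRUE)

# Split the file paths into 3 roughly equal batches.
batches <- split(all_bams, cut(seq_along(all_bams), 3, labels = FALSE))

# Run transcript discovery per batch, caching read classes to rcOutDir.
for (i in seq_along(batches)) {
  bambu(reads = batches[[i]], annotations = bambuAnnotations,
        genome = fa.file, NDR = 0.1, quant = FALSE,
        lowMemory = TRUE, ncore = 14,
        rcOutDir = file.path("MY_DIR", paste0("batch_", i)))
}
```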
After this, I put together the .rds files from each of the three bambu runs, listing the .rds files of each run in a vector and merging them like:
B01 <- Myfiles01
B02 <- Myfiles02
B03 <- Myfiles03
B_allRDs <- c(B01,B02,B03)
mergedAnno = bambu(reads = B_allRDs, genome = fa.file, annotations = bambuAnnotations, quant = FALSE)
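For completeness, the `Myfiles01`–`Myfiles03` vectors above are just character vectors of the .rds paths that each run wrote to its rcOutDir; a minimal sketch, assuming per-batch output directories (the directory names are placeholders):

```r
# Collect the read-class .rds files cached by each bambu run's rcOutDir.
B01 <- list.files("MY_DIR/batch_1", pattern = "\\.rds$", full.names = TRUE)
B02 <- list.files("MY_DIR/batch_2", pattern = "\\.rds$", full.names = TRUE)
B03 <- list.files("MY_DIR/batch_3", pattern = "\\.rds$", full.names = TRUE)

# One combined vector of all cached read-class files, passed to reads=.
B_allRDs <- c(B01, B02, B03)
```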
But then I got the same error as above; the first lines are:
--- Start extending annotations ---
Error in `vec_interleave_indices()`:
! Long vectors are not yet supported in `vec_interleave()`. Result from interleaving would have size 17715772800, which is larger than the maximum supported size of 2^31 - 1.
I would appreciate any help on how to run this massive data set. Best, Niko
Dear Bambu team,
I am running a massive project with 480 BAM files, ~4.8 TB of data in total. Following the previous suggestion for Bambu, I am first running the extended annotations (quant = FALSE), with the idea of running the quantification later in batches. However, there is a major issue when starting the extended annotations:
Is there anything I could do to run Bambu?
My code looks like:
As an additional note, I also get the same warning message that some others have reported in issue #407.
This is with R 4.3.2, Bioc 3.18, and bambu 3.4.1. Platform: x86_64-conda-linux-gnu (64-bit).
All the best, Niko