chernolab / ASpli

BioC current release of ASpli

gbCounts errors #4

Closed by dzijlmans 2 years ago

dzijlmans commented 2 years ago

Hi, I am having issues running gbCounts. I get alternating errors when I try to execute it:

> BAMfiles <- list.files(pattern = "\\.bam$")

> BAMfiles
 [1] "Ctrl_rep1.bam"     "Ctrl_rep2.bam"     "Ctrl_rep3.bam"     "Ctrl_rep4.bam"     "dTAG_24h_rep1.bam"
 [6] "dTAG_24h_rep2.bam" "dTAG_24h_rep3.bam" "dTAG_24h_rep4.bam" "dTAG_4h_rep1.bam"  "dTAG_4h_rep2.bam" 
[11] "dTAG_4h_rep3.bam"  "dTAG_4h_rep4.bam"  "siSFRS2_rep1.bam"  "siSFRS2_rep2.bam"  "siSFRS2_rep3.bam" 
[16] "siSFRS2_rep4.bam" 

> targets <- data.frame(bam = BAMfiles,
                      treatment = c("ctrl", "dtag_24h", "dtag_24h", "dtag_4h",
                                    "dtag_4h", "siSFRS2", "siSFRS2", "ctrl",
                                    "ctrl", "ctrl", "dtag_24h", "dtag_24h",
                                    "dtag_4h", "dtag_4h", "siSFRS2", "siSFRS2"),
                      stringsAsFactors = FALSE)

> getConditions(targets)
[1] "ctrl"     "dtag_24h" "dtag_4h"  "siSFRS2"

> gbcounts <- gbCounts( features = features,
                       targets = targets,
                       minReadLength = 100, maxISize = 50000,
                       libType="SE",
                       strandMode=0)
Summarizing ctrl_1
Error: cannot allocate vector of size 896.0 Mb
In addition: Warning message:
In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.

This is unusual, since there is no '.' in the BAM file names; I made sure to remove them to avoid this message. However, they are present in the index file names (.bai), e.g. 'Ctrl_rep1.bam.bai'. I removed the '.bam' from the index file names and ran it again, which gives me the following error:

> gbcounts <- gbCounts( features = features,
                       targets = targets,
                       minReadLength = 100, maxISize = 50000,
                       libType="SE",
                       strandMode=0)
Summarizing ctrl_1
Error in value[[3L]](cond) : 
  'Realloc' could not re-allocate memory (104857600 bytes)
  file: Ctrl_rep1.bam
  index: NA

However, if I change the index file names back to their original form ('.bam.bai'), I get this same memory error. I am not sure what is going wrong here. I would greatly appreciate your help!
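
One thing worth checking in the targets table above, separately from the memory errors: the hand-typed treatment vector does not follow the order in which BAMfiles is printed (the second file, Ctrl_rep2.bam, ends up labeled dtag_24h). A minimal sketch, assuming the file names keep the `<condition>_rep<N>.bam` pattern shown in the listing, derives the labels from the names instead. The variable names are illustrative, only two replicates per condition are listed for brevity, and the labels keep the capitalization of the file names:

```r
# File names following the pattern in the post (two replicates per condition).
bam_files <- c("Ctrl_rep1.bam", "Ctrl_rep2.bam",
               "dTAG_24h_rep1.bam", "dTAG_24h_rep2.bam",
               "dTAG_4h_rep1.bam", "dTAG_4h_rep2.bam",
               "siSFRS2_rep1.bam", "siSFRS2_rep2.bam")

# Strip the "_rep<N>.bam" suffix to recover the condition of each file.
treatment <- sub("_rep[0-9]+\\.bam$", "", bam_files)

targets <- data.frame(bam = bam_files,
                      treatment = treatment,
                      stringsAsFactors = FALSE)

# Each row now pairs a file with the condition encoded in its own name.
print(targets)
```

Deriving the labels this way keeps the table consistent even if files are added or the listing order changes.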

dzijlmans commented 2 years ago

Note: Freeing unused R memory alleviates the 'Realloc' memory error, but this still leaves the vector allocation error.
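
For reference, freeing memory in an R session usually means removing large objects that are no longer needed and then triggering garbage collection. A small sketch with a placeholder object (`big_object` is hypothetical, standing in for whatever was occupying memory):

```r
# Placeholder for a large object left over from earlier steps.
big_object <- numeric(1e7)   # roughly 80 MB of doubles

# Drop the reference, then ask R to collect garbage and report usage.
rm(big_object)
mem <- gc()

# gc() returns a matrix; its second column is the memory in use, in Mb.
print(mem[, 2])
```

This only releases memory that R already holds. It cannot help when a single step needs more RAM than the machine has available.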

estepi commented 2 years ago

Hi!

Yes, it seems to be a memory problem... Can you try with fewer BAM files? How big are they?

How much memory does your computer have?

Also, I don't understand the issue with the BAI files. They are not used until the plotting step, so they shouldn't cause problems here. In any case, be careful when renaming them: their prefix should match the BAM file names exactly.

thanks
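
On the warning itself: in Bioconductor usage, seqnames are the names of the reference sequences (chromosomes, contigs) that reads align to, so the message most likely concerns sequence names inside the data rather than the BAM or BAI file names. A minimal sketch of what such a normalization looks like, using made-up sequence names. With a real file, the names could probably be read with something like `names(Rsamtools::scanBamHeader(bam)[[1]]$targets)`, which is not run here:

```r
# Hypothetical reference sequence names; contig names often carry a version suffix.
seq_names <- c("chr1", "chr2", "KI270728.1", "GL000009.2")

# Which names contain a literal dot?
has_dot <- grepl(".", seq_names, fixed = TRUE)
print(seq_names[has_dot])

# Replace every dot with an underscore, as the warning text describes.
normalized <- gsub(".", "_", seq_names, fixed = TRUE)
print(normalized)
```

If that reading is right, the warning is informational and unrelated to the allocation errors.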

dzijlmans commented 2 years ago

Thanks for the quick reply!

I have 16 BAM files in total, around 4-5 GB each. I've tried to run ASpli on a single file, but I keep getting the error.

I am running ASpli on my laptop, which has 4 cores and 16 GB of RAM. That's a lot less than you used before, but it should still be able to process one file, right?
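
The file sizes can also be tabulated directly from R before a run. A small sketch using base R only. The demo writes two tiny stand-in files to a temporary directory so that it runs anywhere; in practice the directory would be the one holding the real BAM files:

```r
# Stand-in files so the example is runnable; not real BAM data.
demo_dir <- tempfile("bam_demo_")
dir.create(demo_dir)
writeBin(raw(1024), file.path(demo_dir, "a.bam"))   # 1 KiB
writeBin(raw(2048), file.path(demo_dir, "b.bam"))   # 2 KiB

# Escaping the dot makes the pattern match a literal ".bam" suffix.
paths <- list.files(demo_dir, pattern = "\\.bam$", full.names = TRUE)

# Size of each file in GiB, plus the total.
size_gib <- file.size(paths) / 1024^3
sizes <- data.frame(file = basename(paths), size_gib = size_gib)
print(sizes)
cat("Total (GiB):", sum(size_gib), "\n")
```

File size on disk is only a rough guide, since the memory needed during counting can be considerably larger than the compressed BAM.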

Of note, when I ran ASpli with 4 bam files I got a slightly different error message. Not sure if this is helpful.

> gbcounts <- gbCounts( features = features,
+                       targets = test,
+                       minReadLength = 100, maxISize = 50000,
+                       libType="SE",
+                       strandMode=0)
Summarizing dtag_24h_1
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'splitAsList': cannot allocate vector of size 831.6 Mb
In addition: Warning message:
In FUN(X[[i]], ...) :
  Some seqnames had a '.' present in their names. ASpli had to normalize them using '_'.

dzijlmans commented 2 years ago

Hi, I switched from my work laptop to our online server (64 cores, 512 GB RAM), and the analysis ran in full without any further issues. I would recommend adding a section on minimum memory requirements to the vignette so that others can avoid these issues.