However, when I try to run this output through Epinano_DiffErr.R I run into the following error:
Rscript Epinano_DiffErr.R -k NHA-hTERT_5mer.sum_err.csv -w DIPG-IV_5mer.sum_err.csv -c 30 -d 0.1 -t 3 -o DIPG-IV_NHA-hTERT_5mer_sumErr --feature sum_err3
Error:
Error in merge.data.frame(dat1, dat2, by = "chr_pos") :
negative length vectors are not allowed
This appears to be due to a memory limit issue.
Note:
I also tried changing line 126 in Epinano_DiffErr.R:
combine <- merge(dat1, dat2, by="chr_pos")
to:
combine <- dplyr::full_join(dat1, dat2, by="chr_pos")
I thought that the dplyr package could fix the memory limit issue, but I'm getting this error now:
Error: cannot allocate vector of size 127613.3 Gb
Execution halted
This is the size of the dataframes I want to merge:
[1] "Number of rows in dat1: 3571272"
54.5 Mb
[1] "Number of rows in dat2: 9592079"
146.4 Mb
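An allocation request that large for two tables of a few million rows would be consistent with chr_pos values repeating in both tables, since a join emits one row for every matching pair of rows. That is an assumption, not something confirmed in this thread. Below is a minimal Python sketch (standard library only) for estimating the joined row count before merging. The function names are hypothetical, and the key columns are passed in as a parameter because the column layout of the sum_err CSVs is not shown here.

```python
import csv
from collections import Counter


def key_counts(path, key_columns):
    """Count rows per join key; the key is the tuple of the given columns."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        return Counter(tuple(row[c] for c in key_columns) for row in reader)


def join_sizes(counts_a, counts_b):
    """Return (inner_rows, full_rows) that a join on the key would produce."""
    # Each shared key contributes count_a * count_b rows to the result.
    inner = sum(n * counts_b[k] for k, n in counts_a.items() if k in counts_b)
    # A full join additionally keeps the unmatched rows from both sides.
    only_a = sum(n for k, n in counts_a.items() if k not in counts_b)
    only_b = sum(n for k, n in counts_b.items() if k not in counts_a)
    return inner, inner + only_a + only_b


def worst_keys(counts_a, counts_b, top=5):
    """Shared keys contributing the most rows to the join, largest first."""
    shared = {k: n * counts_b[k] for k, n in counts_a.items() if k in counts_b}
    return sorted(shared.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

Calling key_counts on each CSV and passing the two results to join_sizes gives the row counts without building the merged table. If the inner count is orders of magnitude larger than either input, the keys are not unique, and worst_keys shows which ones dominate.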
Have you run into this error when running Epinano_DiffErr.R, and if so, what was your solution?
PS: It would also be great to be able to pass the 5 sum_err columns at the same time, as was suggested in #122.
Hi @GeoffLyle, sorry for the slow reply. Were you able to solve this issue? Also, thanks for your suggestion about passing the 5 sum_err columns; we will keep this in mind for future updates.
I have been running into an issue trying to run Epinano_DiffErr.R on the output from Epinano_sumErr.py.
Running Epinano_sumErr.py appears to work:
Epinano_sumErr.py --quality --file NHA_hTERT_DRNA_20220609_self_transcript_aligned.sorted.plus_strand.per.site.5mer.csv --out NHA-hTERT_5mer.sum_err.csv --kmer 5
python3 Epinano_sumErr.py --quality --file full_fq_to_sample_transcripts_output.sorted.plus_strand.per.site.5mer.csv --out DIPG-IV_5mer.sum_err.csv --kmer 5