Hello, I've detected a strange behaviour in the mut.to.sigs.input function. It causes an out-of-memory error (even with 54GB!) when parsing big files in the beep loop.
This is the affected code:
for (i in unique(mut[, sample.id])) {
  tmp = subset(mut, mut[, sample.id] == i) # Failing line
  beep = table(tmp$tricontext)
  for (l in 1:length(beep)) {
    trimer = names(beep[l])
    if (trimer %in% all.tri) {
      final.matrix[i, trimer] = beep[trimer]
    }
  }
}
What I've seen is that, when the subset line is executed, the number of selected rows is squared. For example, when performing a subset of 100 samples (and 10 columns), the dimensions of the tmp matrix were 10000x10 (!) instead of the expected 100x10.
I've checked 3 different ways to perform the same operation, and all of them behave as expected.
I suggest implementing the "tmp2" or "tmp4" solution.
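For illustration only, here is a minimal sketch of one way to do the per-sample subset with a plain logical row index. It is not necessarily the same as the "tmp2" or "tmp4" variants mentioned above. It assumes mut is a base R data.frame; the toy data, the column name "Sample" and the contents of all.tri are made up for the example.

```r
# Toy data (hypothetical values); column names mirror the snippet above.
sample.id <- "Sample"
mut <- data.frame(
  Sample     = c("s1", "s1", "s2", "s2", "s2"),
  tricontext = c("A[C>A]A", "A[C>A]A", "A[C>G]T", "A[C>A]A", "X[?]X"),
  stringsAsFactors = FALSE
)
all.tri <- c("A[C>A]A", "A[C>G]T", "A[C>T]G")

samples <- unique(mut[[sample.id]])
final.matrix <- matrix(0, nrow = length(samples), ncol = length(all.tri),
                       dimnames = list(samples, all.tri))

for (i in samples) {
  # mut[[sample.id]] is a plain vector, so the comparison yields one
  # logical value per row and the subset can never exceed nrow(mut).
  tmp <- mut[mut[[sample.id]] == i, , drop = FALSE]
  stopifnot(nrow(tmp) <= nrow(mut))

  # Count trinucleotide contexts and keep only the known ones.
  beep <- table(tmp$tricontext)
  keep <- intersect(names(beep), all.tri)
  final.matrix[i, keep] <- beep[keep]
}

print(final.matrix)
```

Checking dim(tmp) inside the loop on a small input should make it easy to confirm whether the row count still blows up.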
Thank you!