Open dtgnn opened 2 years ago
Dear @dtgnn!
I am very happy to hear that you are using the package! And thank you very much! I realize the documentation here should be improved!
It is a little complex, since the .copy argument is used by the coder::copybig() function but is passed from coder::categorize() via coder::codify() before it gets there. Hence, it is not documented in ?categorize but in ?copybig and in ?codify. Anyway, arguments passed from coder::categorize() to coder::codify() must be wrapped in a list such as: categorize(..., codify_args = list(.copy = <TRUE/FALSE>)). This is because categorize() can pass arguments between its own methods, as well as to both codify() and set_classcodes() (if x is of class data.table).
Please let me know whether this works or not!
Thank you for your message, @eribul. With your input I now see that the .copy argument is indeed listed in the codify() help page... my bad! I'll try to amend my code and report back with the results.
Hello @eribul,
Just a quick update to say that I tried passing both options to codify(), but neither seemed to handle my large dataframe well.
Using categorize(..., codify_args = list(.copy = FALSE)) produced the following error:
Error: cannot allocate vector of size 1.3 Gb
Error during wrapup: cannot allocate vector of size 1.4 Gb
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Using categorize(..., codify_args = list(.copy = TRUE)) made the R session eat up all my available memory (>100GB); I interrupted the process to keep the session from crashing.
I have resorted to slicing my dataset and iterating over the samples. It seems to do the job.
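A minimal sketch of that slice-and-iterate workaround, in base R only. The helper `process_in_chunks()`, the toy data and the stand-in `count_rows()` are hypothetical names made up for illustration, not part of coder; in practice the per-chunk function would wrap the actual categorize() call. It assumes the data can be split by an id column so that no id ends up in two chunks.

```r
# Hypothetical helper (not part of coder): process a data frame in chunks
# of whole ids and row-bind the per-chunk results.
process_in_chunks <- function(df, id_col, chunk_size, process_chunk) {
  ids <- unique(df[[id_col]])
  # Assign every id to a chunk number: 1, 1, ..., 2, 2, ...
  chunk_of_id <- ceiling(seq_along(ids) / chunk_size)
  id_chunks <- split(ids, chunk_of_id)

  results <- lapply(id_chunks, function(chunk_ids) {
    # Only the rows of the current chunk are handed to the worker function
    process_chunk(df[df[[id_col]] %in% chunk_ids, , drop = FALSE])
  })

  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Toy data standing in for the large data frame
people <- data.frame(id = rep(1:5, each = 2), value = 1:10)

# Stand-in for the real per-chunk work (e.g. a call to categorize())
count_rows <- function(chunk) {
  aggregate(value ~ id, data = chunk, FUN = length)
}

res <- process_in_chunks(people, "id", chunk_size = 2,
                         process_chunk = count_rows)
print(res)
```

Splitting on ids rather than on raw row ranges keeps all rows for one unit together, which matters if the per-chunk step aggregates per id. Whether this actually lowers peak memory depends on what the per-chunk function allocates.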
Thank you again for your help!
I am sorry to hear that!
Is it possible, however, that you are running a 32-bit version of R? If so, I suspect that the 1.4 Gb limit is caused by that, and not by your actual RAM. If you are unsure, you can type R.version$arch in the console to find out. (It is also stated on the third line of the start-up message when you start R.) If possible, I would suggest using a 64-bit version of R.
And just to rule out the obvious: the >100 GB is your RAM (not your disk space), right? :-)
R version x86_64. 100GB of RAM.
Hi and thank you for your work on the coder package. I ran into issues while applying the categorize function to a fairly large dataframe (~4GB). The function returns the following error message:
But there seems to be no way (judging from the documentation) to actually set the copy argument. I've tried including either copy = TRUE or .copy = TRUE in my calls to categorize(), in both cases without effect. Is there another way to address the issue?