Open mvaudel opened 7 years ago
@mvaudel I can imagine a quick-and-dirty solution: uncompress the file using gunzip and then read the resulting file in as a big.matrix. Beyond that, I'm not aware of any R interface that reads directly from gzipped files. If such an interface exists, we could certainly explore it; otherwise I think we will likely refer users to simply uncompress the file themselves (assuming the other authors feel the same).
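A minimal sketch of the uncompress-then-read workaround, using only base R connections so that no external gunzip binary is needed. The helper name `decompress_gz` and the file names in the usage comment are hypothetical; only `read.big.matrix` comes from the package under discussion.

```r
# Stream-decompress a .gz file to a plain file in fixed-size chunks,
# so the whole file never has to fit in memory.
# `decompress_gz` is a hypothetical helper, not part of bigmemory.
decompress_gz <- function(gz_path, out_path, chunk_bytes = 1e6) {
  src <- gzfile(gz_path, open = "rb")   # reading yields decompressed bytes
  on.exit(close(src), add = TRUE)
  dst <- file(out_path, open = "wb")
  on.exit(close(dst), add = TRUE)
  repeat {
    buf <- readBin(src, what = "raw", n = chunk_bytes)
    if (length(buf) == 0L) break        # end of the compressed stream
    writeBin(buf, dst)
  }
  invisible(out_path)
}

# Usage (hypothetical file name; requires the bigmemory package):
# txt <- decompress_gz("matrix.txt.gz", tempfile(fileext = ".txt"))
# x <- bigmemory::read.big.matrix(txt, sep = "\t", type = "double")
```

The drawback is the one raised later in the thread: the uncompressed copy costs extra time and disk space.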
Thank you for your answer. It would be really convenient to read directly from the gzipped files: our files are huge, so reading from them and decompressing on the fly would save substantial time and disk space. Are you working on the files themselves or using a connection? In the latter case, letting us provide the connection directly instead of the file name should do the trick (https://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html).
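To illustrate the connection idea, here is a rough sketch of reading a numeric, delimited text file through an already-created connection, chunk by chunk, into a preallocated matrix. The function `read_matrix_from_connection` is hypothetical and is not how `read.big.matrix` works internally; it assumes no header line, purely numeric fields, and known dimensions.

```r
# Fill a preallocated matrix from any text connection (file, gzfile, ...),
# reading a bounded number of lines at a time.
# Hypothetical helper: assumes no header and numeric fields only.
read_matrix_from_connection <- function(con, nrow, ncol, sep = "\t",
                                        chunk_lines = 10000L) {
  if (!isOpen(con)) {
    open(con, "rt")                     # keep the read position between chunks
    on.exit(close(con), add = TRUE)
  }
  out <- matrix(NA_real_, nrow = nrow, ncol = ncol)
  filled <- 0L
  while (filled < nrow) {
    lines <- readLines(con, n = min(chunk_lines, nrow - filled))
    if (length(lines) == 0L) break      # file ended early
    values <- scan(text = lines, what = numeric(), sep = sep, quiet = TRUE)
    stopifnot(length(values) == length(lines) * ncol)
    rows <- filled + seq_len(length(lines))
    out[rows, ] <- matrix(values, nrow = length(lines), ncol = ncol,
                          byrow = TRUE)
    filled <- filled + length(lines)
  }
  out
}

# Usage (hypothetical file name and dimensions):
# x <- read_matrix_from_connection(gzfile("matrix.txt.gz"), nrow = 1e6, ncol = 50)
```

In principle the in-memory `matrix` could be swapped for a `big.matrix` allocated up front and filled with the same row-wise assignment, but that variant is an untested assumption here.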
Hey, check this function. You can find a vignette with more information.
This may not be super fast, but it is quite flexible.
Check all the arguments you need to specify, especially file.nline, which you have to provide explicitly because the function can't compute it on a compressed file.
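If the line count is not known in advance, one way to obtain it without writing an uncompressed copy to disk is to stream through the file once and count. This is a sketch using base R only; `count_lines_gz` is a hypothetical helper, and the resulting number would then be passed as the `file.nline` argument mentioned above.

```r
# Count the lines of a gzipped text file by streaming through it in chunks.
# Hypothetical helper; costs one full pass over the compressed file.
count_lines_gz <- function(gz_path, chunk_lines = 100000L) {
  con <- gzfile(gz_path, open = "rt")
  on.exit(close(con), add = TRUE)
  n <- 0                                # double, to avoid integer overflow
  repeat {
    k <- length(readLines(con, n = chunk_lines, warn = FALSE))
    if (k == 0L) break
    n <- n + k
  }
  n
}

# Usage (hypothetical file name):
# n_lines <- count_lines_gz("matrix.txt.gz")
```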
Any updates on this? I am trying to read a large .txt.gz file that contains character/string data. I know fread can read .txt.gz files, but the file is larger than my available RAM. I can't use bigstatsr::big_read because it does not support character data.
Would it be possible to combine read.big.matrix with fread in some way, to support reading .gz files?
Maybe this?
Cool, I see they have a workaround for reading .gz files, so this should work. Thanks!
Hi,
Thank you for this useful package. I usually read my matrices from text files using read.big.matrix. I was wondering whether it would be possible to support input from gzipped files?
Best regards,
Marc