coolbutuseless / zstdlite

Fast, configurable in-memory compression of R objects with zstd

Can this library be used to import ZST files? #3

Closed (swaheera closed this issue 7 months ago)

swaheera commented 2 years ago

I am working with the R programming language. I want to use the smallest file from this website (https://files.pushshift.io/reddit/comments/), namely https://files.pushshift.io/reddit/comments/RC_2005-12.zst. My goal is to import this file into R and then query it to find comments containing certain terms. For example, I want to find every comment that contains the word "tacos".

I have downloaded this file onto my computer, and now I would like to import it into R. I have never heard of or worked with this file format before. I tried searching online for how to import this kind of file into R, but I am not sure how it can be done.

Does anyone know how I can import this zst file into R and then query it for specific search terms (e.g. "basketball")?

Thanks!

Note: this is my session info:

> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252    LC_MONETARY=English_Canada.1252 LC_NUMERIC=C                    LC_TIME=English_Canada.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] htm2txt_2.2.2         dplyr_1.0.9           RedditExtractoR_2.1.5

loaded via a namespace (and not attached):
 [1] tinytex_0.40      tidyselect_1.1.2  xfun_0.30         remotes_2.4.2     purrr_0.3.4       vctrs_0.4.1       generics_0.1.3    testthat_3.1.4    usethis_2.1.6    
[10] htmltools_0.5.2   yaml_2.3.5        utf8_1.2.2        rlang_1.0.2       pkgbuild_1.3.1    pillar_1.7.0      glue_1.6.2        withr_2.5.0       DBI_1.1.3        
[19] sessioninfo_1.2.2 lifecycle_1.0.1   visNetwork_2.1.0  devtools_2.4.3    htmlwidgets_1.5.4 memoise_2.0.1     evaluate_0.15     knitr_1.39        callr_3.7.0      
[28] fastmap_1.1.0     ps_1.6.0          curl_4.3.2        fansi_1.0.3       cachem_1.0.6      desc_1.4.1        pkgload_1.2.4     jsonlite_1.8.0    fs_1.5.2         
[37] brio_1.1.3        digest_0.6.29     processx_3.5.3    RJSONIO_1.3-1.6   rprojroot_2.0.3   cli_3.3.0         tools_4.1.3       magrittr_2.0.2    tibble_3.1.6     
[46] crayon_1.5.1      pkgconfig_2.0.3   ellipsis_0.3.2    prettyunits_1.1.1 assertthat_0.2.1  rmarkdown_2.14    rstudioapi_0.13   R6_2.5.1          igraph_1.2.11    
[55] compiler_4.1.3