eddelbuettel / rcppsimdjson

Rcpp Bindings for the 'simdjson' Header Library
116 stars, 13 forks

Add a vignette and pkgdown web page #41

Closed melsiddieg closed 2 years ago

melsiddieg commented 4 years ago

I would like to volunteer a vignette and a pkgdown site to highlight this amazing package.

eddelbuettel commented 4 years ago

We have discussed the need, or lack thereof, for pkgdown and don't think we're there yet.

There is also "nothing to do" (in the sense of cookie-cutter-all-the-same sites) and a lot to do in terms of nicer CSS.

But we can talk about a vignette, and in fact, @knapply and I have. What did you have in mind?

melsiddieg commented 4 years ago

The documentation is already really good. However, I think it would help to format it as a vignette with longer explanations and better-formatted examples. As a use case I have the CORD-19 dataset from Kaggle in mind: it contains about 150,000 JSON files of scientific papers about coronavirus, made available for NLP analysis. I think it would be compelling, since I found that I can extract the paper abstracts from all those files in about 1 minute using RcppSimdJson.
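For illustration, the kind of extraction described above might look roughly like the sketch below. It is not the code from this thread: the file layout (an `abstract` array of objects with a `text` field) is an assumption about the CORD-19 schema, and the two tiny files are hypothetical stand-ins written to a temporary directory so the example runs without downloading anything.

```r
library(RcppSimdJson)

# Hypothetical stand-in files; the "abstract" -> [ { "text": ... } ] layout
# is an assumed schema, not something confirmed in this thread.
json_dir <- tempfile("cord19_")
dir.create(json_dir)
writeLines('{"paper_id":"a1","abstract":[{"text":"First abstract."}]}',
           file.path(json_dir, "a1.json"))
writeLines('{"paper_id":"b2","abstract":[]}',
           file.path(json_dir, "b2.json"))

files <- list.files(json_dir, pattern = "\\.json$", full.names = TRUE)

# The JSON Pointer query pulls only the first abstract paragraph per file.
# Files where the pointer does not resolve yield NA instead of an error.
abstracts <- fload(files,
                   query = "/abstract/0/text",
                   query_error_ok = TRUE,
                   on_query_error = NA_character_)

# Flatten to a plain character vector, one entry per file.
abstract_text <- unlist(abstracts, use.names = FALSE)
```

Passing a `query` means only the requested value is materialised as an R object for each file, which is presumably part of why a run over many files can stay fast.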

eddelbuettel commented 4 years ago

Sure.

"In principle" a vignette is meant to be re-run at each package build. Downloading and parsing 150k files each time would be madness.

"In practice" one can also do static vignettes, and I often do so. All that said, we don't even have to put it into the package. This sounds like it would also be a nice use case for the Rcpp Gallery, which is also Markdown-based. Maybe you would want to write a post there?

melsiddieg commented 4 years ago

@eddelbuettel that is true, and it would be better as a blog post. However, I still think that the package examples would make for a good vignette.

eddelbuettel commented 2 years ago

Closing this for lack of follow-up. We'll write a vignette one day; I just did so for another years-old package.