syvwlch / Data-Ignota

A data-driven exploration of Ada Palmer's Terra Ignota series
https://syvwlch.github.io/Data-Ignota/
MIT License

[Feature] Bundle Data Into an R Data Package #12

Open syvwlch opened 2 years ago

syvwlch commented 2 years ago

Once the data set is more mature, consider creating an R data package to make it easier to share with, and use in, the R community.

syvwlch commented 2 years ago

Research: https://rstudio4edu.github.io/rstudio4edu-book/data-pkg.html

cdrigby commented 2 years ago

So this would serve as a template others could apply to their own, possibly copyright-constrained, literature?

syvwlch commented 2 years ago

Well, the data package would contain the data pulled from Terra Ignota in a format that makes it easy to ingest and work with in R. So it would make analysis of that specific text possible without forking this repo, or downloading the individual CSV files and reading the file descriptions.

If I do end up writing some reusable code for the analysis or visualization part of the project, I could release that for those who want to apply it to their own texts. Not sure how useful it would be, as the heavy lifting is the XML markup that's going on in the private repo.

cdrigby commented 2 years ago

OK, that makes sense. Mentally, I pictured the private repo as providing just a character stream of the text that you were parsing. I don't own the book, so I should not access that side of things.

syvwlch commented 2 years ago

Private side:

Public side:

cdrigby commented 2 years ago

OK, the details of how you split it up make sense now that I see it.