cu-mkp / manuscript-object

Data extraction, transformation, manipulation, and analysis of the Making and Knowing Project's core dataset of the Digital Critical Edition, Secrets of Craft and Nature.

review priorities and larger goals of code #76

Open njr2128 opened 3 years ago

njr2128 commented 3 years ago

Go over open issues and goals to decide on the best redesign/refactoring steps

njr2128 commented 3 years ago

areas to consider:

njr2128 commented 3 years ago

manuscript-object code should make it easy to create datasets that can then be used for visualization or analysis

njr2128 commented 3 years ago

A "package": a separate place for "I want this data in this form" requests, so the data can be used for other things

gschare commented 3 years ago

update.py (the process of generating and writing derivatives) should ideally be a single, simple Python script (with some switches and configuration) that requests data structures from the "manuscript object" with various transformations applied, e.g. "an array of all folios rendered in plain text", i.e. allFolios/txt.
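To make the request idea concrete, here is a minimal sketch of how a spec like allFolios/txt could be resolved. Everything in it is an assumption for illustration: the `FOLIOS` sample data, the `request` and `parse_args` functions, and the `--request` and `--limit` switches are hypothetical and do not reflect the current manuscript-object API.

```python
import argparse

# Hypothetical stand-in data: folio id -> {format: rendered text}.
# The real manuscript object would build this from the edition's sources.
FOLIOS = {
    "001r": {"txt": "For making varnish", "xml": "<ab>For making varnish</ab>"},
    "001v": {"txt": "Sand casting", "xml": "<ab>Sand casting</ab>"},
}


def request(spec, folios=FOLIOS):
    """Resolve a 'selection/format' spec such as 'allFolios/txt' to a list."""
    selection, fmt = spec.split("/", 1)
    if selection != "allFolios":
        raise ValueError(f"unknown selection: {selection!r}")
    # One entry per folio, in folio-id order, rendered in the requested format.
    return [folios[fid][fmt] for fid in sorted(folios)]


def parse_args(argv=None):
    """Parse the (hypothetical) switches an update script might accept."""
    parser = argparse.ArgumentParser(description="Generate derivatives (sketch).")
    parser.add_argument("--request", default="allFolios/txt",
                        help="what to ask the manuscript object for")
    parser.add_argument("--limit", type=int, default=None,
                        help="return at most this many items")
    return parser.parse_args(argv)


def main(argv=None):
    """Return the requested data; a real script would then write it out."""
    args = parse_args(argv)
    return request(args.request)[:args.limit]
```

With this shape, `main(["--request", "allFolios/xml", "--limit", "1"])` would return only the first folio in XML form, and the script itself stays a thin layer over the request.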

Most importantly, the core manuscript object ought not to perform any "actions" (writing files) itself. Rather, its entire job should be to return Python data structures, which some script may then use to actually write that data. There are design principles behind this separation that I won't get into right now. The most salient one, however, is that while manuscript object itself should be highly flexible, the scripts which use manuscript object must be allowed to ossify to some degree and serve narrow, specific purposes.
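A rough sketch of that separation, under stated assumptions: the `Manuscript` class, its `all_folios` method, and the `write_folios` helper are hypothetical names invented here, not the existing code. The point is only that the object returns plain Python data and a separate, narrow function is the sole place that touches the disk.

```python
import tempfile
from pathlib import Path


class Manuscript:
    """Hypothetical core object: holds data, returns data, never writes."""

    def __init__(self, folios):
        self._folios = dict(folios)

    def all_folios(self, transform=str):
        """Return {folio_id: transformed text}, with no side effects."""
        return {fid: transform(text) for fid, text in sorted(self._folios.items())}


def write_folios(rendered, out_dir, suffix=".txt"):
    """Narrow-purpose script code: the only place that writes files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for fid, text in rendered.items():
        path = out_dir / f"{fid}{suffix}"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


# Usage: the caller decides where (and whether) anything is written.
ms = Manuscript({"001r": "  For making varnish  "})
rendered = ms.all_folios(transform=str.strip)
with tempfile.TemporaryDirectory() as tmp:
    written = write_folios(rendered, tmp)
    names = [p.name for p in written]
```

Because `all_folios` only returns a dict, the same call can feed a notebook, a test, or a writer script without any change to the core object.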

njr2128 commented 3 years ago

spectrum of visualization purposes/services:

njr2128 commented 3 years ago

building out Jupyter notebooks