PhonologicalCorpusTools / SLPAA


export a corpus as a human-readable dictionary #308

Closed kvesik closed 2 months ago

kvesik commented 2 months ago

Kathleen requested a minimal-effort export of a small corpus into human-readable form (dictionary, JSON, etc.) so she can see what it looks like and decide whether it's something we want to have available for LREC.

Refer to issue #71 for bigger-picture thoughts about exporting data.

kvesik commented 2 months ago

@kchall please run branch 308, open (or create) a small corpus, and choose "export corpus" from the "analysis functions (beta)" menu.

The output is human-readable, but contains a LOT of extraneous information (e.g. internal state-tracking variables, ALL entries for the movement tree even if they're not checked, ALL "added info" right-click menus even if they're not populated, etc.).

If this general format is an acceptable starting point, though, it's quite simple (if not necessarily quick) to go through each Python class in the project (or at least the more complex ones that have many variables not of interest to a user) and add a utility method that specifies which instance variables should actually be included in the export.
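For illustration, here is a rough sketch of what such a per-class utility method might look like. None of these names come from the SLPAA codebase: `ExportableMixin`, `export_fields`, `to_export`, and the toy `Movement` and `Sign` classes are all hypothetical, and the real classes would likely need more care (e.g. for Qt objects or trees).

```python
import json


def _export_value(value):
    """Convert a value into plain dicts/lists so it can be dumped as JSON."""
    if isinstance(value, ExportableMixin):
        return value.to_export()
    if isinstance(value, (list, tuple)):
        return [_export_value(v) for v in value]
    if isinstance(value, dict):
        return {str(k): _export_value(v) for k, v in value.items()}
    return value


class ExportableMixin:
    """Hypothetical mixin: each class lists the instance variables to export."""

    # Subclasses override this with the names a user would care about.
    export_fields = ()

    def to_export(self):
        # Only the listed variables are exported; everything else is skipped.
        return {name: _export_value(getattr(self, name))
                for name in self.export_fields}


class Movement(ExportableMixin):  # toy stand-in, not the real class
    export_fields = ("label", "checked")

    def __init__(self, label, checked):
        self.label = label
        self.checked = checked
        self._dirty = True  # internal state-tracking variable, not exported


class Sign(ExportableMixin):  # toy stand-in, not the real class
    export_fields = ("gloss", "movements")

    def __init__(self, gloss, movements):
        self.gloss = gloss
        self.movements = movements
        self._cache = {}  # internal variable, not exported


sign = Sign("HELLO", [Movement("straight", True), Movement("circular", False)])
exported = sign.to_export()
print(json.dumps(exported, indent=2))
```

With this approach the export still lists unchecked entries (such as `circular` above); it only drops the variables that a class does not name in `export_fields`.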

kchall commented 2 months ago

@kvesik Cool. So, just to make sure I'm understanding: we could basically pre-filter it so that the .txt file contains only the variables marked as 1/true, rather than everything? If so, then yes, that's probably worth it. Thanks!
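A minimal sketch of that kind of pre-filtering, assuming the exported data has already been reduced to nested dicts and lists. The `prune` helper and the sample `raw` data below are hypothetical and not taken from SLPAA; the rule for what counts as "empty" is also an assumption and would need to match what the project actually wants to hide.

```python
def _is_empty(value):
    """True for unchecked/unpopulated values; 0 is kept as a real value."""
    return (value is None or value is False
            or value == "" or value == [] or value == {})


def prune(value):
    """Recursively drop entries that are False, None, or empty."""
    if isinstance(value, dict):
        pruned = {k: prune(v) for k, v in value.items()}
        return {k: v for k, v in pruned.items() if not _is_empty(v)}
    if isinstance(value, list):
        pruned = [prune(v) for v in value]
        return [v for v in pruned if not _is_empty(v)]
    return value


# Made-up example data standing in for one exported sign.
raw = {
    "gloss": "HELLO",
    "movement": {"straight": True, "circular": False, "path": {"arc": False}},
    "added_info": {"uncertain": False, "note": ""},
    "repetitions": 0,
}
filtered = prune(raw)
```

Here `filtered` keeps `gloss`, the checked `straight` entry, and `repetitions`, while the unchecked movement entries and the unpopulated `added_info` menu disappear entirely.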