Open lharpercannon opened 11 months ago
In this case, the real utility comes from the ability to create template components for methodologies (or other concepts) that would require different types of information, different types of help-text, etc. The user would not have to read a million versions of a methodology template, trying to figure out which one pertains to the model at hand. Instead, you can tell it to give you the necessary instructions for the methodology you've chosen.
This utility would also lessen the cognitive load w/ documenting methodologies that were considered but were not ultimately chosen. That is, if you've decided to go with gradient boosting, the majority of your prod and dev methodology details will be organized around the documentation needs of that final methodology. It can be difficult to adequately cover the relevant details for candidate methodologies if they differ significantly from the final method.
As I think about how a glossary feature would need to work, I'm reminded of something the game director for Night in the Woods said, to the effect of: "If I could start it over again, I wouldn't hard code the dialogue for our game. I would use a tool like YarnSpinner or put it into excel so it's easier to change out as needed."
Obviously we're very familiar with this idea, but it has me thinking--video games very frequently have to use databases to organize game assets. Sometimes those assets include things like: lines of dialogue, text found on in-game objects, 'key' columns indicating the triggers that have to occur for a specific text asset to appear in-game (or, to rephrase, columns that help situate one asset within the larger codebase & help situate any given game system within the superset of all game systems).
Which is to say, I would like to see if I can dig anything up about how developers handle that sort of information flow--it may be easier to take strategies as-is from that domain than to pull from data science, given that games almost always take a final form of .exe
@redsoxfan0219 ok, did some noodling on this--lmk initial thoughts
@lharpercannon
Re: format of text repository: Let's not overcomplicate the POC. Simpler is better to start. YML
is more human-readable, especially for non-developers (part of the audience I need to consider). JSON would be my second option. Let's only go to a database if/when the yml/json file gets so big that it presents a problem for loading the entire thing in memory. I'm guessing that'll be a while for us, if we ever get to that point.
How do you imagine a glossary call working in the text of the template Word document? Could something like ``{{ glossary: 'xgboost' }} work?
Objective
Source in and read terms from a .json or .yml glossary
Subtask 1: Picking Glossary Format
JSON
Pros and cons & other relevant details to go here.
YML
Pros and cons & other relevant details to go here