redsoxfan0219 / doc-app

Feature Planning: Glossary #2

Open lharpercannon opened 11 months ago

lharpercannon commented 11 months ago

Objective

Load and read terms from a .json or .yml glossary file

Subtask 1: Picking Glossary Format

JSON

Pros and cons & other relevant details to go here.

YML

Pros and cons & other relevant details to go here

lharpercannon commented 11 months ago

Brainstorming

Use glossary to pull in methodology-specific template elements

Description

  1. Pull in a blank table that is pre-configured with the columns you'd need for documentation. E.g., for gradient boosting machines: hyperparameter/parameter name, description, range of values tested, rationale, etc.
  2. Use the Glossary to programmatically read in information about the hyperparameters you've selected. This could tie into one or more of the other functions: (a) injecting content into an existing doc and/or (b) adding tables systematically.
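
A rough sketch of how the two steps above could fit together. Everything here is hypothetical: the glossary layout, the field names (`columns`, `hyperparameters`), and the helper `build_table_rows` are assumptions for illustration, not anything that exists in doc-app yet.

```python
# Hypothetical glossary structure: one entry per methodology.
GLOSSARY = {
    "gradient_boosting": {
        "columns": ["name", "description", "range_tested", "rationale"],
        "hyperparameters": {
            "learning_rate": {"description": "Shrinkage applied to each tree's contribution"},
            "max_depth": {"description": "Maximum depth of each tree"},
        },
    },
}


def build_table_rows(glossary, methodology, selected):
    """Return a header row plus one row per selected hyperparameter.

    Cells the glossary cannot fill (e.g. range tested, rationale) are left
    blank for the model developer to complete.
    """
    entry = glossary[methodology]
    columns = entry["columns"]
    rows = [columns]
    for name in selected:
        # Unknown hyperparameters still get a row, just with empty cells.
        known = entry["hyperparameters"].get(name, {})
        row = [name if col == "name" else known.get(col, "") for col in columns]
        rows.append(row)
    return rows


rows = build_table_rows(GLOSSARY, "gradient_boosting", ["learning_rate", "max_depth"])
```

The returned list of rows is format-neutral, so the same output could feed either (a) content injection or (b) table generation in the Word document.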

Wouldn't the normal templating be enough to handle stuff like this?

In this case, the real utility comes from the ability to create template components for methodologies (or other concepts) that would require different types of information, different types of help-text, etc. The user would not have to read a million versions of a methodology template, trying to figure out which one pertains to the model at hand. Instead, you can tell it to give you the necessary instructions for the methodology you've chosen.

This utility would also lessen the cognitive load w/ documenting methodologies that were considered but were not ultimately chosen. That is, if you've decided to go with gradient boosting, the majority of your prod and dev methodology details will be organized around the documentation needs of that final methodology. It can be difficult to adequately cover the relevant details for candidate methodologies if they differ significantly from the final method.

lharpercannon commented 11 months ago

As I think about how a glossary feature would need to work, I'm reminded of something the game director for Night in the Woods said, to the effect of: "If I could start it over again, I wouldn't hard code the dialogue for our game. I would use a tool like YarnSpinner or put it into Excel so it's easier to change out as needed."

Obviously we're very familiar with this idea, but it has me thinking--video games very frequently have to use databases to organize game assets. Sometimes those assets include things like: lines of dialogue, text found on in-game objects, 'key' columns indicating the triggers that have to occur for a specific text asset to appear in-game (or, to rephrase, columns that help situate one asset within the larger codebase & help situate any given game system within the superset of all game systems).

Which is to say, I would like to see if I can dig anything up about how game developers handle that sort of information flow. It may be easier to take strategies as-is from that domain than to pull from data science, given that games almost always take a final form of .exe.

lharpercannon commented 11 months ago

@redsoxfan0219 ok, did some noodling on this--lmk initial thoughts

redsoxfan0219 commented 11 months ago

@lharpercannon

Re: format of text repository: Let's not overcomplicate the POC. Simpler is better to start. YML is more human-readable, especially for non-developers (part of the audience I need to consider). JSON would be my second option. Let's only go to a database if/when the yml/json file gets so big that it presents a problem for loading the entire thing in memory. I'm guessing that'll be a while for us, if we ever get to that point.
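
For the POC, loading the whole file into memory could look something like the sketch below. The function name `load_glossary` and the assumption that the file holds a top-level mapping of terms are mine, not settled design. The YAML branch assumes the third-party PyYAML package is installed; the JSON branch needs only the standard library.

```python
import json
from pathlib import Path


def load_glossary(path):
    """Load the whole glossary file into a dict keyed by term."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if path.suffix in {".yml", ".yaml"}:
        import yaml  # PyYAML (third-party); imported only when a YAML file is used
        data = yaml.safe_load(text)
    elif path.suffix == ".json":
        data = json.loads(text)
    else:
        raise ValueError(f"Unsupported glossary format: {path.suffix}")
    if not isinstance(data, dict):
        raise ValueError("Glossary file must contain a mapping of terms")
    return data
```

Because both branches return a plain dict, the rest of the code would not need to care which format we pick, and swapping in a database later would only mean replacing this one function.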

How do you imagine a glossary call working in the text of the template Word document? Could something like `{{ glossary: 'xgboost' }}` work?
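
To make the question concrete, here is a minimal sketch of resolving that kind of tag with a regular expression. The tag syntax is just the one floated above, and `resolve_glossary_tags` and the sample definition are hypothetical. It operates on plain strings only; text pulled from a .docx can be split across runs, so a real implementation would need to handle that separately.

```python
import re

# Matches {{ glossary: 'term' }} or {{ glossary: "term" }}, with flexible spacing.
GLOSSARY_TAG = re.compile(r"\{\{\s*glossary:\s*(['\"])(?P<term>.+?)\1\s*\}\}")


def resolve_glossary_tags(text, glossary):
    """Replace each glossary tag with its definition; leave unknown terms untouched."""
    def substitute(match):
        term = match.group("term")
        # Fall back to the original tag so missing terms stay visible in the doc.
        return glossary.get(term, match.group(0))
    return GLOSSARY_TAG.sub(substitute, text)


# Illustrative glossary content only.
glossary = {"xgboost": "A gradient boosting library built on decision trees."}
result = resolve_glossary_tags("Model: {{ glossary: 'xgboost' }}", glossary)
```

Leaving unknown tags in place, rather than raising or blanking them, is one possible choice: it keeps a typo in the template visible to a non-developer reviewing the output.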