fieldsoftheworld / ftw-baselines

Code for running baseline models/experiments with the Fields of The World dataset
https://fieldsofthe.world/
MIT License

Use --model_type parameter instead of just path to ckpt file in inference. #22

Open cholmes opened 1 week ago

cholmes commented 1 week ago

From @calebrob6:

Also, one thing that'd be really nice would be to get rid of the ckpt file path and just have a --model_type parameter to ftw inference run that lets you select from an enum of trained models and auto-downloads the checkpoint if it doesn't exist.

Originally posted by @cholmes in https://github.com/fieldsoftheworld/ftw-baselines/issues/21#issuecomment-2428063056
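A minimal sketch of how the suggestion above might look, using only the Python standard library. The enum values reuse the tentative names floated later in this thread; the URLs, the `.ckpt` file naming, the `resolve_checkpoint` and `build_parser` helpers, and the use of argparse are all assumptions for illustration, not the actual ftw CLI.

```python
import argparse
import urllib.request
from enum import Enum
from pathlib import Path


class ModelType(str, Enum):
    """Hypothetical short names for the released checkpoints."""
    TWO_CLASS_FULL = "2class_full"
    TWO_CLASS_CCBY = "2class_ccby"
    THREE_CLASS_FULL = "3class_full"
    THREE_CLASS_CCBY = "3class_ccby"


# Placeholder URLs: the real release locations are not given in this thread.
CHECKPOINT_URLS = {
    m: f"https://example.invalid/checkpoints/{m.value}.ckpt" for m in ModelType
}


def resolve_checkpoint(model_type, models_dir=Path("models"),
                       download=urllib.request.urlretrieve):
    """Return the local checkpoint path, downloading it only if it is missing."""
    model_type = ModelType(model_type)  # raises ValueError for unknown names
    path = Path(models_dir) / f"{model_type.value}.ckpt"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        download(CHECKPOINT_URLS[model_type], path)
    return path


def build_parser():
    """Stand-in parser; the real CLI may be built with a different framework."""
    parser = argparse.ArgumentParser(prog="ftw-inference-sketch")
    parser.add_argument(
        "--model_type",
        choices=[m.value for m in ModelType],
        required=True,
        help="Short name of a trained model; its checkpoint is fetched if absent.",
    )
    return parser
```

Passing the downloader in as a parameter keeps the lookup logic testable without network access, and restricting the flag to `choices` means a typo in the model name fails at argument parsing rather than at download time.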

cholmes commented 1 week ago

Any thoughts on semantics, @calebrob6? Would model_type just be something like 2class_full, 2class_ccby, 3class_full, 3class_ccby? (Better names welcome.) Is there some clever way to get 'the latest' of each from GitHub or Hugging Face, or are you thinking we just hard-code it and update when there's a release? Is there any concept of 'latest' or 'releases' on Hugging Face, so that we can point at one location and pick up updates automatically? Just thinking aloud about whether there are ways to avoid hard-coding every release.

And I guess we'd give people an option for the output path but, following the data download, default to something like a models/ directory relative to the current one (and check there first to see whether the checkpoint already exists).

At some point it might be nice to be able to set an environment variable for your 'ftw home dir', where the data and models get downloaded.
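One way the path resolution described above could work, as a rough sketch: an explicit output path wins, otherwise a models/ directory under the 'ftw home dir', which itself falls back to the current directory. The variable name `FTW_HOME`, the helper names, and the `.ckpt` naming are hypothetical; nothing here is an existing ftw setting.

```python
import os
from pathlib import Path

FTW_HOME_ENV = "FTW_HOME"  # hypothetical environment variable name


def ftw_home(environ=os.environ):
    """Base directory for downloads: $FTW_HOME if set, else the current directory."""
    value = environ.get(FTW_HOME_ENV)
    return Path(value).expanduser() if value else Path.cwd()


def models_dir(out=None, environ=os.environ):
    """An explicit output path wins; otherwise use <ftw home>/models."""
    if out is not None:
        return Path(out)
    return ftw_home(environ) / "models"


def find_existing(name, out=None, environ=os.environ):
    """Return the checkpoint path if it is already on disk, else None."""
    candidate = models_dir(out, environ) / f"{name}.ckpt"
    return candidate if candidate.is_file() else None
```

With no variable set and no explicit path, this resolves to models/ relative to the working directory, which matches the default proposed above.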

m-mohr commented 1 week ago

I guess that will be simpler if we work on #32 first?

hannah-rae commented 1 week ago

What about having a .md file with a table of the different models, giving each one a longer description and the "short name" that we use as this parameter value? I think that would be useful when we start adding more model options (e.g. from the literature), and it lets us avoid really long values that try to be fully descriptive, since you can just refer to the lookup table.
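To illustrate that such a table can stay human-readable and still be read by tools, here is a small sketch that parses a markdown table into a lookup keyed by short name. The table contents and the `parse_model_table` helper are made up for the example, and the parser assumes simple cells with no `|` characters inside them.

```python
MODELS_MD = """\
| Short name  | Description                         |
|-------------|-------------------------------------|
| 2class_full | Placeholder description for model A |
| 3class_ccby | Placeholder description for model B |
"""


def parse_model_table(markdown):
    """Map each short name to its remaining columns from a markdown table."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore prose around the table
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the separator row, which contains only dashes and colons.
        if all(c and set(c) <= set("-:") for c in cells):
            continue
        rows.append(cells)
    header, *body = rows
    return {row[0]: dict(zip(header[1:], row[1:])) for row in body}
```

The first column is treated as the key, so further columns (a link, a license) could be added to the table without changing the parser.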

cholmes commented 1 week ago

I like the idea of having a definitive table that maps each 'short name' to its longer description and a link to the model.

A markdown file could work, and its human readability is nice, though maybe we want something even simpler? I think the main goal is to have a definitive list that tools can rely upon. Perhaps we include 'versions' in there too, or just link to the latest version.
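As one possible "even simpler" format, a JSON file could carry the versions and a 'latest' pointer per short name. The sketch below assumes a made-up schema, placeholder URLs, and a hypothetical `checkpoint_url` helper; it only shows how a tool might resolve 'latest' without hard-coding it.

```python
import json

# Hypothetical registry contents; schema and URLs are placeholders.
REGISTRY_JSON = """
{
  "2class_full": {
    "latest": "v2",
    "versions": {
      "v1": "https://example.invalid/2class_full/v1.ckpt",
      "v2": "https://example.invalid/2class_full/v2.ckpt"
    }
  }
}
"""


def checkpoint_url(registry, short_name, version="latest"):
    """Look up a checkpoint URL, following the 'latest' pointer when asked."""
    entry = registry[short_name]
    if version == "latest":
        version = entry["latest"]
    return entry["versions"][version]


registry = json.loads(REGISTRY_JSON)
```

Publishing a new release would then mean adding one entry under `versions` and moving the `latest` pointer, while anyone who pinned `v1` keeps getting the same file.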