fieldsoftheworld / ftw-baselines

Code for running baseline models/experiments with the Fields of The World dataset
https://fieldsofthe.world/
MIT License
32 stars 1 forks source link

CLI command updates? #21

Open cholmes opened 10 hours ago

cholmes commented 10 hours ago

Our current CLI hierarchy is a bit weird, and I'd like to refactor download_imagery.py to be a part of the CLI so we have a consistent interface, which makes things weirder.

ftw download - gets the data to run the models, etc. ftw unpack - unpacks the data you downloaded ftw model has three sub-commands - fit, test, and inference.

The logical place to add 'download imagery' would be right next to inference, something like ftw model download-images, but that feels buried, and it's not really related to the inference.

Also the top level download and unpack feel like they shouldn't be the same level as 'model', but instead should be grouped together. I also want to add the ability to download the fiboa field boundaries - see #20 (though I'm leaning towards putting that under the current download command.).

I'd like to propose a new structure:

ftw data, which would have ftw data download and ftw data unpack ftw model, which would have ftw model fit and ftw model test ftw inference which would have ftw inference create-input and ftw inference run (which would be the same as ftw model inference now)

We could also have ftw inference polygonize as its own command instead of tacked onto inference as is currently in #19. Then --simplify would be a bit clearer, since right now it could look like an option for the actual inference, not just the polygonization. I'm pretty sure I want to add a --output-format command to, so people could choose gpkg, geoparquet, geojson. And maybe a --fiboa flag that lets you control if you want to have it as fiboa output.

If we do things right then we might even be able to do ftw inference run ... | ftw inference polygonize for people who don't care about the intermediate raster - just pipe the output from running the inference to the polygonization'.

calebrob6 commented 9 hours ago

I like this proposed change!

My only question here is why ftw inference create-input instead of ftw inference download or ftw inference imagery

I theoretically like the piped output (or more so the idea of piping output), but for ease of access I'd wrap this type of "one command" function into an end to end set of args under the new fiboa inference --win_a ... --win_b ... --output_fn fields.gpkg.

Also, one thing that'd be really nice would be to get rid of the ckpt file path and just have a --model_type parameter to ftw inference run that lets you select from an enum of trained models and auto-downloads the checkpoint if it doesn't exist.

cholmes commented 8 hours ago

My only question here is why ftw inference create-input instead of ftw inference download or ftw inference imagery

I don't feel strongly, and didn't love what I came up with. Hrm, and actually I think download can make more sense after changing the other to ftw data download. I guess my only hesitation is that it's not so descriptive of what it's actually doing. Could be download imagery or get imagery. But I also like one word ones. I'll change it.

I theoretically like the piped output (or more so the idea of piping output), but for ease of access I'd wrap this type of "one command" function into an end to end set of args under the new fiboa inference --win_a ... --win_b ... --output_fn fields.gpkg.

I'm not sure what you're saying here? I think you said you liked having an independent command for polygonize. You're saying we'd have:

fiboa inference download --win_a ... --win_b ... --output_fn output.tif fiboa inference polygonize --input_fn output.tif And then fiboa inference --win_a ... --win_b ... --output_fn fields.gpkg ? And that one would call the other two? And would have less options / more defaults? Or would try to make sure that the flags for each of the subcommands are different so they could all be used together?

Also, one thing that'd be really nice would be to get rid of the ckpt file path and just have a --model_type parameter to ftw inference run that lets you select from an enum of trained models and auto-downloads the checkpoint if it doesn't exist.

That would be nice. I'll add a ticket for it. It'd be 2 class and 3 class and cc-by and full? Is there some clever way to get 'the latest' of each from github or hugging face? Or you're thinking we just hard code it and update when there's a release? (will add these questions on a new ticket).

cholmes commented 8 hours ago

Also, any strong feelings on --output_fn? I was going to align these with how the fiboa cli works, which does -o and --output, since I think there's some chance some subset of people will use both tools, and it'd be nice if they worked the same. I'd also never seen the _fn thing, I thought it meant 'function' initially...