Open NoraLoose opened 1 year ago
I agree. We cannot expect our audience to know how to train a model on this new dataset. I train ML models on other datasets, but the dataset really makes a big difference to the workflow. I consider myself part of the target audience, but I would not know how to quickly train such a model. (In fact, I would expect this to take days to weeks if I had to do it from scratch.)
(Alternatively, if it were to be catered to a more beginner audience, the Quickstart Step 3 could simply describe how to replicate the paper results by running the baseline_models/ (or one of them) against the subsampled training data?)
Yes, this is a good suggestion.
I think the quickstart is ready for testing.
@jerrylin96 did you close this issue on purpose? I have not completed the testing of the quickstart. In fact, I was waiting on instructions, see https://github.com/leap-stc/ClimSim/issues/55#issuecomment-1693874835.
Step 3 of the Quickstart reads:
https://github.com/leap-stc/ClimSim/blob/f94b862a37a9b7e2b9dc700c5c18d665124a2078/README.md?plain=1#L81-L83
Is it reasonable to expect that our target audience will know how to do this? As someone who has never trained an ML model before, I personally don't know where to begin, but I don't think I'm the target audience.
(Alternatively, if it were to be catered to a more beginner audience, the Quickstart Step 3 could simply describe how to replicate the paper results by running the baseline_models/ (or one of them) against the subsampled training data?)

If the expectation is that the target Quickstart audience knows how to train a model against this data without further instruction, then I'll just have to recuse myself from being a tester for this section until I know how to do that (which probably won't be today)! 😄