FluxML / fluxml.github.io

Flux Website
https://fluxml.ai
MIT License

docs: Changes to the Getting Started document #103

Closed: chipbuster closed this 2 years ago

chipbuster commented 2 years ago

This PR updates the "Getting Started" document with a more fleshed-out tutorial. Instead of running a single training step, it creates a model, generates noisy examples from it, and then tries to recover the original model from those examples. This shows new users what the training process actually accomplished and how they can use its result.
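For readers who have not seen the diff, the workflow described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the Julia/Flux code from the PR. The linear model, the names `make_noisy_data` and `fit_line`, and all parameter values are assumptions chosen for the example, and it uses hand-written full-batch gradient descent rather than a Flux optimiser.

```python
import random


def make_noisy_data(true_w, true_b, n, noise, rng):
    """Sample n points from y = true_w * x + true_b plus Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_w * x + true_b + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def fit_line(xs, ys, lr=0.1, epochs=500):
    """Fit y = w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical "original model" that training should recover.
TRUE_W, TRUE_B = 3.0, -1.0

rng = random.Random(0)  # fixed seed so the run is repeatable
xs, ys = make_noisy_data(TRUE_W, TRUE_B, n=200, noise=0.1, rng=rng)
w, b = fit_line(xs, ys)

print(f"true model:    w={TRUE_W}, b={TRUE_B}")
print(f"learned model: w={w:.3f}, b={b:.3f}")
print(f"prediction at x=0.5: {w * 0.5 + b:.3f}")
```

Printing the learned parameters next to the true ones shows what training did. The final line shows how the trained result can then be used for prediction.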

Potential Issues

According to BenchmarkTools, the old tutorial took a few milliseconds to run and consumed 18.5 KiB of memory, while the new one takes about 100 ms to run and consumes 71.06 MiB of memory.

BenchmarkTools excludes compilation time. In practice, when I ran the two from the command line, the runtime was overwhelmingly dominated by compilation (approximately 20 seconds to complete each one), so the difference in run time is unlikely to be noticeable. The memory usage may be a concern, though, depending on where users are expected to try this tutorial. I don't expect it to be an issue, since I couldn't even compile Flux on a VPS with 1 GB of RAM due to memory limits, but I don't know how this library is used. If 71 MiB of memory is a concern, I can try cutting down the size of the training set at the cost of worse accuracy.

Also, although I've written an SGD implementation or two, I'm not terribly familiar with the field of ML and am not particularly fluent in how things are usually talked about or described. If anyone would like wording or vocabulary changes to make things more idiomatic, I'd be happy to make them.

DhairyaLGandhi commented 2 years ago

Thanks for taking a look, these are generally pretty good ideas! It is considered good practice to exclude compilation time when benchmarking.

We probably want to keep things as simple as possible for a getting started guide. I believe we already have some guides similar to what is in this PR.

chipbuster commented 2 years ago

Definitely agreed about the simplicity. I think the older document did a great job of keeping the training of the model as simple as possible, but it was potentially unclear about what exactly you were training or what you could do with it afterwards.

If there are any areas of this PR that you think should be cut or moved elsewhere, please do point them out! Striking the right balance between enough detail and too much detail is always a challenge.

chipbuster commented 2 years ago

And I need to do a pass to make the bulk code at the end match these changes. Might take me a minute since I'm not on my normal setup right now.

chipbuster commented 2 years ago

@ToucheSir Sorry for the delay.

ToucheSir commented 2 years ago

No need to apologize, this is a pretty fast turnaround for a docs PR :). Let me make a couple tweaks and we'll be good to merge.