FluxML / fluxml.github.io

Flux Website
https://fluxml.ai

Getting Started example confusion #101

Closed: chipbuster closed this issue 2 years ago

chipbuster commented 2 years ago

Hey all. This was brought to my attention by this SO question, and I'm not 100% convinced that I'm right about what happens here, so please bear with me.

It looks like what the example on "Getting Started" wants to do is define a linear model from R^5 onto R^2, and then train that with a single example (x, y), where x is in R^5 and y is in R^2.

However, when we run the line data = zip(x, y), we generate a length-2 iterator containing [(x[1], y[1]), (x[2], y[2])] and ignore the other 3 elements of x, because zip stops at the end of its shortest argument. This is then fed into the loss function, which is still capable of computing an answer because of the elementwise operations. So what we actually end up doing is training a function from R^1 to R^1, with two examples.
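The truncation can be reproduced in plain Julia without Flux. The sketch below assumes the tutorial's model has the form predict(x) = W*x .+ b with a squared-error loss; those definitions are not quoted in this thread, so treat them (and the name samples) as assumptions made for illustration.

```julia
x, y = rand(5), rand(2)

# zip stops at the shorter input, so only two scalar pairs come out.
samples = collect(zip(x, y))
println(length(samples))              # 2
println(samples[1] == (x[1], y[1]))   # true

# Assumed model and loss (not quoted in the thread).
W, b = rand(2, 5), rand(2)
predict(x) = W * x .+ b
loss(x, y) = sum((predict(x) .- y) .^ 2)

println(size(predict(x)))      # (2,)   full 5-element input
println(size(predict(x[1])))   # (2, 5) scalar input broadcasts against W
println(loss(samples[1]...) isa Float64)   # the loss still returns a number
```

The (2, 5) shape suggests why nothing fails: a scalar input broadcasts against W instead of raising a dimension error, so the loss still evaluates on each scalar pair.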

I can replace x = rand(5) with x = vcat(rand(2), [missing, missing, missing]) and the whole tutorial still runs without a hitch, which seems to confirm that the last three elements of x are never examined.
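A minimal check of that experiment, using only Base Julia, might look like the following. It only inspects what zip yields and does not run the tutorial's training step; the variable name samples is illustrative.

```julia
# Same substitution as in the comment above: the last three entries are missing.
x = vcat(rand(2), [missing, missing, missing])
y = rand(2)

samples = collect(zip(x, y))

# Only the first two elements of x are ever paired with a y value.
println(length(samples))                              # 2
println(any(ismissing(xi) for (xi, _) in samples))    # false
```

Because none of the pairs contains a missing value, any code that consumes only this iterator cannot notice the change to x.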

Is this intended behavior, or was the line intended to be something like data = zip(eachcol(x), eachcol(y))?
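For comparison, here is a sketch of what the eachcol variant yields when x and y are plain vectors. The one-element vector [(x, y)] is shown as another way to express a single sample; it is an illustration rather than the fix adopted in the repository.

```julia
x, y = rand(5), rand(2)

# eachcol treats a vector as a single column, so zip yields exactly one pair.
by_column = collect(zip(eachcol(x), eachcol(y)))
println(length(by_column))        # 1

xc, yc = by_column[1]
println(xc == x && yc == y)       # true: the pair holds the full vectors

# An explicit one-element collection carries the same single sample.
single = [(x, y)]
println(length(single))           # 1
```

Either form hands the loss one (x, y) pair with x in R^5 and y in R^2, which matches the model described at the top of this issue.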

ToucheSir commented 2 years ago

Good catch. Either the zip should be removed to leave data = (x, y), or loss(d...) should be changed to something like loss(x, y). Do you mind filing a quick PR?

chipbuster commented 2 years ago

Will be happy to once I get home tonight.

On Thu, Sep 23, 2021, at 17:33, Brian Chen wrote:

> Good catch, either the zip should be removed to leave data = (x, y), or loss(d...) should be changed to something like loss(x, y). Do you mind filing a quick PR?


-- Cheers, K. Song

chipbuster commented 2 years ago

Would y'all be open to a PR that generates noisy samples from a linear model and then tries to recover that model for the tutorial?

Right now this tutorial seems only to show you how to write the code to train a model, but doesn't show you what to do once you've trained it or even what's being trained. I understand wanting to keep the first article on the subject as minimal as possible, so if you'd prefer that in a side post or somewhere else entirely, let me know.

EDIT: To be clear, this would come later. PR #102 fixes this issue.
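As a rough illustration of that proposal, the sketch below generates noisy samples from a known linear model and recovers it. It deliberately uses plain gradient descent with hand-written gradients instead of Flux's training API, and the ground-truth values, noise level, learning rate, and the helper name fit_linear are all hypothetical choices for this example.

```julia
using Random

Random.seed!(42)

# Hypothetical ground-truth model from R^5 to R^2 (values chosen for illustration).
W_true = [1.0 -2.0  0.5 0.0 3.0;
          0.5  0.5 -1.0 2.0 0.0]
b_true = [0.5, -1.0]

# 200 noisy samples; each column of X is one input in R^5.
n = 200
X = randn(5, n)
Y = W_true * X .+ b_true .+ 0.01 .* randn(2, n)

# Plain gradient descent on the mean squared error (not Flux's train!).
function fit_linear(X, Y; lr = 0.1, steps = 500)
    n = size(X, 2)
    W = zeros(size(Y, 1), size(X, 1))
    b = zeros(size(Y, 1))
    for _ in 1:steps
        R = W * X .+ b .- Y                            # residuals, 2 x n
        W .-= lr .* (2 / n) .* (R * X')                # gradient w.r.t. W
        b .-= lr .* (2 / n) .* vec(sum(R, dims = 2))   # gradient w.r.t. b
    end
    return W, b
end

W_est, b_est = fit_linear(X, Y)
println(maximum(abs.(W_est .- W_true)))   # largest error in the recovered weights
println(maximum(abs.(b_est .- b_true)))   # largest error in the recovered bias
```

Comparing W_est and b_est against the known parameters is the step the current tutorial leaves out: it shows both what is being trained and how to judge the result.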

ToucheSir commented 2 years ago

That would be great :slightly_smiling_face:. Feel free to clean up anything you feel is confusing as well.