ml5js / ml5-next-gen

Repo for next generation of ml5.js: friendly machine learning for the web! 🤖
https://ml5js.org/

`normalizeData()` does not handle only zeros in one of the inputs to the neural net #204

Closed jchwenger closed 1 month ago

jchwenger commented 1 month ago

Hi there,

I've been noticing what seems to be a little bug.

If you go to this example and comment out all examples for one class, e.g. all the "blue-ish" objects, training fails with infinite loss, because the normalized data has NaNs in the third coordinate of `xs` (check `classifier.data`). I bumped into this in this sketch, where I would also like training to work even if only two classes are provided. As long as there are nonzero numbers everywhere in the training input, it's fine, but normalizing an input that contains only zeros doesn't seem to work. I wonder where you would want to add a check for this: directly in `normalizeDataRaw`?
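A minimal sketch of the symptom, for illustration only. It assumes plain min-max scaling applied per input column; `minMaxNaive` and the sample RGB rows are hypothetical and are not taken from the ml5.js source or from the linked example.

```javascript
// Hypothetical stand-in for per-column min-max scaling;
// NOT the actual ml5.js implementation.
function minMaxNaive(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  return values.map((v) => (v - min) / (max - min));
}

// Made-up RGB inputs with no "blue-ish" examples:
// the third channel is 0 in every row.
const xs = [
  [255, 0, 0],
  [200, 30, 0],
  [0, 255, 0],
];

// Normalize column by column.
const columns = [0, 1, 2].map((i) => minMaxNaive(xs.map((row) => row[i])));

console.log(columns[0]); // finite values between 0 and 1
console.log(columns[2]); // every entry is NaN, because 0 / 0 is NaN
```

Once a single input column is NaN, the loss computed from it can no longer be a finite number, which would match the failed training described above.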

ziyuan-linn commented 1 month ago

Hi @jchwenger,

Thank you again!

It seems like the code uses `(value - min) / (max - min)` to compute the normalized values. When min == max, both the numerator and the denominator are zero, and 0 / 0 evaluates to NaN, which is the source of the NaNs. So the bug occurs whenever all values of a parameter are the same, not only when they are all zeros.
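One possible guard, sketched under assumptions: `minMaxSafe` is a hypothetical helper, and mapping a constant column to 0 is just one reasonable choice, not necessarily what the library does.

```javascript
// Hypothetical guarded min-max scaling; NOT the actual ml5.js code.
function minMaxSafe(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min;
  // Assumption: a constant column carries no information,
  // so map it to 0 instead of computing 0 / 0.
  if (range === 0) {
    return values.map(() => 0);
  }
  return values.map((v) => (v - min) / range);
}

console.log(minMaxSafe([0, 0, 0]));  // all zeros, no NaN
console.log(minMaxSafe([7, 7, 7]));  // all zeros as well: same value everywhere
console.log(minMaxSafe([0, 5, 10])); // 0, 0.5, 1
```

Checking `range === 0` rather than `max === 0` matters here: a column of all 7s triggers the same division by zero as a column of all 0s.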

I opened PR #208 to address this.

From local testing, the linked sketch now works even if not all classes are provided.

jchwenger commented 1 month ago

Hi @ziyuan-linn, awesomeness! Nice thinking with this check, looking forward to the PR being integrated!