biolab / orange3

🍊 Orange: Interactive data analysis
https://orangedatamining.com

SOM Error #6254

Closed lucixhub closed 1 year ago

lucixhub commented 1 year ago

What's wrong?

Error when changing input data to SOM

Traceback (most recent call last):
  File "C:\Users\aljaz\AppData\Local\Programs\Orange\lib\site-packages\Orange\widgets\unsupervised\owsom.py", line 694, in run
    self.som.fit(self.data, N_ITERATIONS,
  File "C:\Users\aljaz\AppData\Local\Programs\Orange\lib\site-packages\Orange\projection\som.py", line 54, in fit
    self.init_weights_pca(x)
  File "C:\Users\aljaz\AppData\Local\Programs\Orange\lib\site-packages\Orange\projection\som.py", line 27, in init_weights_pca
    pc_length, pc = np.linalg.eig(np.cov(x.T))
  File "<__array_function__ internals>", line 5, in eig
  File "C:\Users\aljaz\AppData\Local\Programs\Orange\lib\site-packages\numpy\linalg\linalg.py", line 1318, in eig
    _assert_finite(a)
  File "C:\Users\aljaz\AppData\Local\Programs\Orange\lib\site-packages\numpy\linalg\linalg.py", line 209, in _assert_finite
    raise LinAlgError("Array must not contain infs or NaNs")
numpy.linalg.LinAlgError: Array must not contain infs or NaNs
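
The failure in the traceback can be reproduced outside Orange with NumPy alone. The following is a minimal sketch, assuming the widget passes a single two-dimensional point to `np.cov(x.T)` as in the `init_weights_pca` line shown above; the coordinates are made up for illustration. With one observation, the covariance is normalized by N - 1 = 0, so every entry becomes NaN and `np.linalg.eig` rejects the matrix.

```python
import warnings

import numpy as np

# A single 2-D point, as when exactly one point is added in Paint Data.
# The coordinates are arbitrary illustration values.
x = np.array([[0.25, 0.75]])

with warnings.catch_warnings(), np.errstate(all="ignore"):
    warnings.simplefilter("ignore")  # silence "Degrees of freedom <= 0"
    cov = np.cov(x.T)  # one observation: normalization divides by N - 1 = 0

# Every entry of the 2x2 covariance matrix is NaN.
all_nan = bool(np.isnan(cov).all())

try:
    np.linalg.eig(cov)
    raised = False
except np.linalg.LinAlgError:
    # Same exception type as in the traceback above.
    raised = True
```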

How can we reproduce the problem?

Workflow (attached as err.zip): Paint Data -> SOM (PCA, network 10x10) -> Data Table

Open Paint Data and add exactly 1 point (with Put). SOM throws the error above.

What's your environment?

janezd commented 1 year ago

You've found yourself a perfect task for your introduction to coding Orange. :)

I suggest you do the following:

  1. Add a test in which SOM gets a single data point. See the existing tests and modify one of them (say test_missing_one_row_data) to pass data with a single row to the widget. (If you have some table, say self.iris, use self.iris[:1] to get a table that only contains the first row.) Such a test should fail. (I tried, and it does.)
  2. Then write another test in which you have multiple rows, but every row except one has some missing value. I suppose this will fail as well.
  3. Then fix the widget. See class Error and search for self.Error in the code to understand how errors work. There is, in fact, a place that already reports a similar error (no rows without missing values). I suppose that you could simply modify this error message (and the corresponding check that triggers it) to something like "SOM needs at least two data rows without missing values". Also rename the message from no_defined_rows to not_enough_data.
  4. The tests should no longer fail.
  5. Now improve the tests: they should check that the error is shown. See how other tests do it. Also check that the error message disappears if the widget then receives proper data. Continue by triggering the error again (by using faulty data) and then check that the error disappears when the widget is given None as data. (This shouldn't be a problem if your fix is indeed going to consist of just a minor modification of the existing code.)
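
The steps above can be sketched without Orange as follows. This is only an illustration of the check and of the test sequence from step 5, not the widget's actual code: `FakeSomWidget`, `complete_rows`, `MIN_ROWS` and the `error` attribute are hypothetical stand-ins, and the real fix would use the widget's own Error class and the existing widget test helpers.

```python
import numpy as np

MIN_ROWS = 2  # assumption: at least two complete rows are required
NOT_ENOUGH_DATA = "SOM needs at least two data rows without missing values"


def complete_rows(x):
    """Return the rows of x that contain no missing (NaN) values."""
    x = np.asarray(x, dtype=float)
    return x[~np.isnan(x).any(axis=1)]


class FakeSomWidget:
    """Hypothetical stand-in for the widget; not Orange's OWSOM."""

    def __init__(self):
        self.error = None

    def set_data(self, x):
        self.error = None  # always clear the previous error first
        if x is None:
            return
        if len(complete_rows(x)) < MIN_ROWS:
            self.error = NOT_ENOUGH_DATA


# Step 1: a single row. Step 2: several rows, only one without NaN.
single_row = np.array([[0.3, 0.7]])
one_complete = np.array([[0.3, 0.7], [np.nan, 0.1], [0.5, np.nan]])
proper = np.array([[0.3, 0.7], [0.4, 0.1], [0.5, np.nan]])

# Step 5: error shown, cleared by proper data, shown again, cleared by None.
widget = FakeSomWidget()
states = []
for data in (single_row, proper, one_complete, None):
    widget.set_data(data)
    states.append(widget.error is not None)
```

Clearing the error at the start of `set_data` is what makes the last two checks in step 5 pass regardless of what the widget received before.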

If you need any help, ask.