natekupp / ffx

Fast Function Extraction
http://trent.st/ffx

Error on some data #16

Closed: zegkljan closed this issue 8 years ago

zegkljan commented 8 years ago

We want to use FFX in a comparative study, but we ran into a problem: for some data, STEP 2 ends up with 0 bases (STEP 2: Regress on all 0 bases: begin.), and the pathwise learn then fails because it is handed an empty array:

...
Build with approach 2/7 (inter1 denom0 expon0 nonlin0 thresh1): begin
  STEP 1A: Build order-1 bases: begin
  STEP 1A: Build order-1 bases: done.  Have 65 order-1 bases.
  STEP 1B: Find order-1 base infls: begin
    Pathwise learn: begin. max_num_bases=65
      alpha 1/1249 (4.091439e-02): num_bases=0, nmse=0.009177, time 1.83 s
    Pathwise learn: Early stop because nmse < target
  STEP 1B: Find order-1 base infls: done
  STEP 1C: Build order-2 bases: begin
  STEP 1C: Build order-2 bases: done.  Have 0 order-2 bases.
  STEP 2: Regress on all 0 bases: begin.
    Pathwise learn: begin. max_num_bases=250
Traceback (most recent call last):
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/code/run.py", line 205, in <module>
    main()
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/code/run.py", line 122, in main
    verbose=True)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/ffx/api.py", line 4, in run
    return core.MultiFFXModelFactory().build(train_X, train_y, test_X, test_y, varnames, verbose)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/ffx/core.py", line 443, in build
    next_models = FFXModelFactory().build(train_X, train_y, ss, varnames, verbose)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/ffx/core.py", line 653, in build
    ss, varnames, bases, X, y, ss.final_max_num_bases, ss.final_target_train_nmse, verbose)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/ffx/core.py", line 683, in _basesToModels
    max_num_bases, target_train_nmse, verbose)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/ffx/core.py", line 729, in _pathwiseLearn
    max_iter=max_iter, **fit_params)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/ffx/core.py", line 841, in new_f
    result = f(*args, **kwargs)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/ffx/core.py", line 853, in fit
    return ElasticNet.fit(self, *args, **kwargs)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/scikits/learn/linear_model/coordinate_descent.py", line 122, in fit
    beta, Gram, Xy, y, max_iter, tol)
  File "cd_fast.pyx", line 225, in scikits.learn.linear_model.cd_fast.enet_coordinate_descent_gram (scikits/learn/linear_model/cd_fast.c:2518)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 2072, in norm
    return abs(x).max(axis=axis)
  File "/mnt/data/gandalv/School/PhD/Research/sr-comparison/ffx/ffx-venv/local/lib/python2.7/site-packages/numpy/core/_methods.py", line 26, in _amax
    return umr_maximum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation maximum which has no identity
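
The last numpy frames of the traceback can be reproduced in isolation. This is a minimal sketch, assuming only that the array reaching numpy is empty; the name `w` is illustrative and not taken from FFX or scikits.learn:

```python
import numpy as np

# Empty vector standing in for whatever a fit over 0 bases passes down
# (hypothetical name; not an FFX variable).
w = np.empty(0)

try:
    # Same reduction as the `abs(x).max(axis=axis)` frame in the traceback.
    np.abs(w).max()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

A maximum over zero elements has no identity value to fall back on, so numpy raises instead of returning a number.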

The training data can be downloaded here; the last column is the y-value (i.e. the target value). The testing data are identical to the training data.

Is there any possibility this could be resolved? FFX will get its citation :).

jmmcd commented 8 years ago

Thanks for the report. I didn't get exactly the same error as you (we must be out of sync?) but I got another error related to having zero bases. Let me know if this works.
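
The fix itself is not shown in this thread. As a rough illustration of the kind of guard that avoids a zero-base failure, here is a sketch built around a hypothetical helper, `fit_constant_if_no_bases`, which is not part of FFX. It assumes that a constant-only model is acceptable when no bases survive, and it uses ordinary least squares as a stand-in for the elastic-net path:

```python
import numpy as np

def fit_constant_if_no_bases(X, y):
    """Hypothetical helper (not FFX code): return (intercept, coefs).

    X is a 2-D array with one column per basis function; y is the target.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Guard: with zero columns there is nothing to regress on,
    # so fall back to the constant model y = mean(y).
    if X.shape[1] == 0:
        return float(y.mean()), np.zeros(0)

    # Stand-in for the real learner: least squares with an intercept column.
    A = np.column_stack([np.ones(len(y)), X])
    sol = np.linalg.lstsq(A, y, rcond=None)[0]
    return float(sol[0]), sol[1:]

# Zero bases: the guard returns a constant model instead of raising.
intercept0, coefs0 = fit_constant_if_no_bases(np.empty((4, 0)), [1, 2, 3, 4])

# One basis: the regular path is taken.
intercept1, coefs1 = fit_constant_if_no_bases([[0], [1], [2], [3]], [1, 3, 5, 7])
```

The log above suggests why a constant fallback is plausible for this data: step 1B already stops early with num_bases=0 because the NMSE is below the target.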