apple / turicreate

Turi Create simplifies the development of custom machine learning models.
BSD 3-Clause "New" or "Revised" License

Boosted trees classifier crash for 100k plus entries #1274

Open allenc84 opened 5 years ago

allenc84 commented 5 years ago

Hi, I'm running a simple Boosted Tree Classifier on a dataset that has 5 features.

It runs fine for small datasets (up to 50k entries). However, once I get to 100k entries and up, it starts to crash.

The problem is that my full dataset has 13M lines, so 100k is a tiny subset of the data I'd like to feed into the ML model (which I then want to export and use in Core ML).

I'm using a high-memory instance on Google Cloud (n1-highmem-64), which has 64 CPU cores, 416 GB of memory, and 200 GB of storage.

Any idea what the issue is, and how I can get turicreate to build a model for a dataset with 13M lines?

Alternatively, is there any guidance on how to use the GPU-optimized instances on Google Cloud?

Parsing completed. Parsed 99999 lines in 0.112287 secs.
PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

Boosted trees classifier:
--------------------------------------------------------
Number of examples          : 75899
Number of classes           : 505
Number of feature columns   : 5
Number of unpacked features : 5
+-----------+--------------+-------------------+---------------------+-------------------+---------------------+
| Iteration | Elapsed Time | Training Accuracy | Validation Accuracy | Training Log Loss | Validation Log Loss |
+-----------+--------------+-------------------+---------------------+-------------------+---------------------+
| 1         | 13.308831    | 0.057563          | 0.051994            | 5.327710          | 5.400651            |
| 2         | 22.218108    | 0.061872          | 0.054770            | 4.877129          | 4.990599            |
| 3         | 31.183954    | 0.066206          | 0.054013            | 4.679235          | 4.826240            |
| 4         | 40.197789    | 0.069579          | 0.055528            | 4.541399          | 4.710229            |
| 5         | 49.406107    | 0.073506          | 0.059061            | 4.435775          | 4.627263            |
| 7         | 68.160059    | 0.078789          | 0.059061            | 4.285475          | 4.513437            |
| 8         | 77.481043    | 0.081121          | 0.060828            | 4.229251          | 4.474643            |
| 9         | 86.975972    | 0.083005          | 0.062342            | 4.181239          | 4.443032            |
| 10        | 96.488387    | 0.085535          | 0.061585            | 4.139843          | 4.416114            |
+-----------+--------------+-------------------+---------------------+-------------------+---------------------+
Traceback (most recent call last):
  File "weightReccor_classifier.py", line 17, in <module>
    results = model.evaluate(test_data)
  File "/home/allen/.local/lib/python2.7/site-packages/turicreate/toolkits/classifier/boosted_trees_classifier.py", line 218, in evaluate
    metric=metric)
  File "/home/allen/.local/lib/python2.7/site-packages/turicreate/toolkits/_supervised_learning.py", line 158, in evaluate
    dataset, missing_value_action, metric);
  File "/home/allen/.local/lib/python2.7/site-packages/turicreate/extensions.py", line 290, in <lambda>
    ret = lambda *args, **kwargs: self.__run_class_function(name, args, kwargs)
  File "/home/allen/.local/lib/python2.7/site-packages/turicreate/extensions.py", line 274, in __run_class_function
    ret = self._tkclass.call_function(fnname, argument_dict)
  File "turicreate/cython/cy_model.pyx", line 35, in turicreate.cython.cy_model.UnityModel.call_function
  File "turicreate/cython/cy_model.pyx", line 40, in turicreate.cython.cy_model.UnityModel.call_function
IndexError: map::at

TobyRoseman commented 5 years ago

@allenc84 Sorry you are having this issue. I've gotten similar errors a few times, but each time I couldn't reproduce the issue. Do you get the above error consistently?

znation commented 5 years ago

@hoytak I saw something similar to this not too long ago. I think you may have fixed this issue recently? (Not sure if the fix was in a branch, or went into master at any point.)
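
One possibility that may be worth ruling out (an assumption, not something confirmed in this thread): with 505 classes and a separately held-out test set, `test_data` could contain class labels that never appear in the training rows, and a failed lookup of such a label might surface as `map::at` during `evaluate`. The sketch below is plain Python on hypothetical toy lists. `unseen_labels`, `train`, and `test` are illustrative names and not part of Turi Create.

```python
from collections import Counter


def unseen_labels(train_labels, test_labels):
    """Return {label: count} for test labels that never occur in train_labels."""
    seen = set(train_labels)
    return dict(Counter(label for label in test_labels if label not in seen))


# Hypothetical toy data standing in for the target column of each split.
train = ["a", "b", "b", "c"]
test = ["a", "d", "d", "e"]

# Labels present only in the test split, with how many rows carry each one.
missing = unseen_labels(train, test)
print(missing)
```

To try this on the real data, the target column of each split would first be converted to a Python list. If the result is non-empty, dropping those rows from the test set before calling `model.evaluate` would show whether the crash is related. An empty result would rule this explanation out.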