kaz-Anova / StackNet

StackNet is a computational, scalable and analytical Meta modelling framework
MIT License
issue with regression prediction? #11

Open Data-drone opened 7 years ago

Data-drone commented 7 years ago

When I specify test_file in the train task, it works, but when I run predict separately with task=regression, I get:

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.ClassCastException: ml.stacknet.StackNetRegressor cannot be cast to ml.stacknet.StackNetClassifier
    at stacknetrun.runstacknet.main(runstacknet.java:775)
    ... 5 more

kaz-Anova commented 7 years ago

Yes, I noticed that too. This is a bug; I will fix it.

Data-drone commented 7 years ago

I'm not that experienced in Java, but is it throwing a ClassCastException that isn't handled in your try/catch blocks around line 775?

kaz-Anova commented 7 years ago

That is right. I missed this case :/. It is a small fix. I will update today.
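The failure mode discussed above can be illustrated with a minimal sketch. It assumes, without having checked the StackNet source, that the predict path casts whatever model it loads to the classifier type. `DemoClassifier`, `DemoRegressor`, and both `describe...` methods are hypothetical stand-ins, not StackNet classes or APIs.

```java
class CastDemo {
    // Hypothetical stand-ins for the two model types named in the stack trace.
    static class DemoClassifier {}
    static class DemoRegressor {}

    // Unchecked style: assumes every loaded model is a classifier.
    static String describeUnchecked(Object model) {
        DemoClassifier classifier = (DemoClassifier) model; // throws ClassCastException for a regressor
        return "classifier: " + classifier.getClass().getSimpleName();
    }

    // Checked style: test the runtime type before casting and fail with a clear message otherwise.
    static String describeChecked(Object model) {
        if (model instanceof DemoClassifier) {
            return "classifier";
        }
        if (model instanceof DemoRegressor) {
            return "regressor";
        }
        throw new IllegalArgumentException("Unsupported model type: " + model.getClass().getName());
    }

    public static void main(String[] args) {
        Object loaded = new DemoRegressor(); // e.g. a model saved by a regression training run
        System.out.println("checked dispatch sees a " + describeChecked(loaded));
        try {
            describeUnchecked(loaded);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed with ClassCastException");
        }
    }
}
```

Checking the type with `instanceof` before casting lets the caller branch on the model kind instead of surfacing a raw `ClassCastException`.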

molecularswords commented 7 years ago

I also get this error but with classification of a (very) small test data set and with larger data sets:

Loaded dense test data with 30 and columns 4
loading test data lasted : 0.003000
prediction to has failed due to 3
printing prediction to pred.csv has failed due to null
predicting on test data lasted : 0.003000
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.NullPointerException
    at stacknetrun.runstacknet.main(runstacknet.java:724)
    ... 5 more

I also get this when omitting the preds argument, and when I remove the target from the test data and change test_target to false. This seems to be related to #5.

kaz-Anova commented 7 years ago

@molecularswords OK, I got it. I will fix it in the next release. Thank you for letting me know.
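In the log above, "has failed due to 3" and "has failed due to null" read like exception messages printed by a catch block, after which a later step hits a `NullPointerException`. That reading is an inference from the log, not from the StackNet source. The sketch below shows the general pattern under that assumption; `predict`, `lenientRun`, and `strictRun` are hypothetical names invented for illustration.

```java
class SwallowedExceptionDemo {
    // Hypothetical prediction step: reads one column from every row.
    static double[] predict(double[][] rows, int column) {
        double[] out = new double[rows.length];
        for (int i = 0; i < rows.length; i++) {
            out[i] = rows[i][column]; // throws ArrayIndexOutOfBoundsException if column is out of range
        }
        return out;
    }

    // Lenient style: log the failure and carry on, leaving preds as null.
    static int lenientRun(double[][] rows, int column) {
        double[] preds = null;
        try {
            preds = predict(rows, column);
        } catch (Exception e) {
            System.out.println("prediction has failed due to " + e.getMessage());
        }
        return preds.length; // NullPointerException here hides the original cause
    }

    // Strict style: stop immediately and keep the original exception as the cause.
    static int strictRun(double[][] rows, int column) {
        try {
            return predict(rows, column).length;
        } catch (RuntimeException e) {
            throw new IllegalStateException("prediction failed for column " + column, e);
        }
    }

    public static void main(String[] args) {
        double[][] rows = { {1.0, 2.0, 3.0}, {4.0, 5.0, 6.0} };
        try {
            lenientRun(rows, 3);
        } catch (NullPointerException e) {
            System.out.println("later step hit NullPointerException");
        }
        try {
            strictRun(rows, 3);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + ", caused by " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Failing fast with the original exception attached as the cause makes the first error the one that appears in the stack trace, rather than a secondary `NullPointerException`.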

danieleewww commented 7 years ago

# "Likely printing pred2.csv has some size limitation!"

Ran OK with the following command:

java -Xmx18048m -jar StackNet.jar train task=regression sparse=true has_head=false output_name=DS3_NikunjPP model=model2 pred_file=pred2.csv train_file=xgbfirtop50_nkjDS_train.txt test_file=xgbfirtop50_nkjDS_test.txt test_target=false params=opt_lgbm_12m_lb0642250_lr024nitune.txt verbose=true threads=4 metric=mae stackdata=false seed=1 folds=6 bins=3

Loaded sparse test data with 2985217 and columns 149 (size: 7.39GB)
loading test data lasted : 524.995000
Printing reusable test for level: 1 as : DS3_NikunjPP_test1.csv
Completed: 5.00 %
Completed: 10.00 %
Completed: 15.00 %
Completed: 20.00 %

But failed with the following command:

java -Xmx18048m -jar StackNet.jar train task=regression sparse=true has_head=false output_name=DS3_NikunjPP model=model2 pred_file=pred2.csv train_file=xgbfir_nkjDS_train.txt test_file=xgbfir_nkjDS_test.txt test_target=false params=opt_lgbm_12m_lb0642250_lr024nitune.txt verbose=true threads=4 metric=mae stackdata=false seed=1 folds=5 bins=3

Loaded sparse test data with 2985217 and columns 246 (size:12.81GB)
loading test data lasted : 1001.252000
prediction to has failed due to null
printing prediction to pred2.csv has failed due to null
predicting on test data lasted : 0.066000
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.NullPointerException
    at stacknetrun.runstacknet.main(runstacknet.java:713)
    ... 5 more

Likely printing pred2.csv has some size limitation!

Update on test data size: 9.92GB failed at printing the prediction, and 8.85GB passed.

  1. Loaded sparse test data with 2985217 and columns 196 (9.92GB)
     loading test data lasted : 699.206000
     prediction to has failed due to null
     printing prediction to pred2.csv has failed due to null
     predicting on test data lasted : 0.062000
     Exception in thread "main" java.lang.reflect.InvocationTargetException
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:498)
         at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
     Caused by: java.lang.NullPointerException
         at stacknetrun.runstacknet.main(runstacknet.java:713)
         ... 5 more

  2. Number of elements : 502365687
     Loaded sparse test data with 2985217 and columns 176 (8.85GB)
     loading test data lasted : 610.825000
     Printing reusable test for level: 1 as : DS3_NikunjPP_test1.csv
     Completed: 5.00 %
     Completed: 10.00 %
     Completed: 15.00 %
     Completed: 20.00 %
     Completed: 25.00 %
     ......

  3. Loaded sparse test data with 2985217 and columns 190 (9.7GB)
     loading test data lasted : 662.149000
     prediction to has failed due to null
     printing prediction to pred2_2017Q4.csv has failed due to null
     predicting on test data lasted : 0.107000

  4. Loaded sparse test data with 2985217 and columns 180 (9.1GB)
     loading test data lasted : 641.476000
     Printing reusable test for level: 1 as : DS3_NikunjPP_test1.csv
     Completed: 5.00 %
     Completed: 10.00 %
     Completed: 15.00 %
     Completed: 20.00 %
     Completed: 25.00 %

kaz-Anova commented 7 years ago

Interesting. I have seen this error before, but I am not able to reproduce it. Do you have a small file that reproduces it?

There is no such size limitation.
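One quick arithmetic check on the reported cases is whether the dense cell count (rows times columns) could exceed Java's `int` range. The sketch below assumes that 2985217 in the logs is the row count, and it takes the column counts and pass/fail outcomes from the reports above. `DenseSizeCheck` and its methods are hypothetical helpers, not part of StackNet. Every count stays below `Integer.MAX_VALUE` (the largest, for 246 columns, is 734363382), so a plain `int` overflow of rows times columns does not by itself separate the passing runs from the failing ones.

```java
class DenseSizeCheck {
    // Dense cell count computed in long arithmetic; multiplyExact throws on long overflow.
    static long denseCells(long rows, long columns) {
        return Math.multiplyExact(rows, columns);
    }

    // True when the value can be stored in a Java int (for example as an array length or index).
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long rows = 2985217L; // assumed to be the row count shown in the logs
        int[] columns = {149, 176, 180, 190, 196, 246};
        boolean[] passed = {true, true, true, false, false, false}; // outcomes as reported in the thread
        for (int i = 0; i < columns.length; i++) {
            long cells = denseCells(rows, columns[i]);
            System.out.printf("columns=%d cells=%d fitsInInt=%b reported=%s%n",
                    columns[i], cells, fitsInInt(cells), passed[i] ? "passed" : "failed");
        }
    }
}
```

This only rules out one simple explanation; it says nothing about what actually triggers the failure between 180 and 190 columns.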

danieleewww commented 7 years ago

"prediction to has failed due to null" was reproducible on both my OSX and Ubuntu systems with the test data of 2985217 rows and 246 columns (size: 12.81GB). The test data with 2985217 rows and 149 columns (size: 7.39GB) and other smaller data sizes (3-5 GB) show no such issue on either system.