kaz-Anova / StackNet

StackNet is a computational, scalable and analytical Meta modelling framework
MIT License

Failed to read csv files #20

Closed arisbw closed 7 years ago

arisbw commented 7 years ago

Hello. I tried to use StackNet with CSV files, but I got the following error:

> java -Xmx3048m -jar StackNet.jar train task=classification train_file=train.vm2.csv test_file=test.vm2.csv params=param_amazon_linear.txt pred_file=linear_pred.csv test_target=false verbose=true Threads=4 sparse=true folds=5 seed=1 metric=auc has_head=true
parameter name : task value :  classification
parameter name : train_file value :  train.vm2.csv
parameter name : test_file value :  test.vm2.csv
parameter name : params value :  param_amazon_linear.txt
parameter name : pred_file value :  linear_pred.csv
parameter name : test_target value :  false
parameter name : verbose value :  true
parameter name : threads value :  4
parameter name : sparse value :  true
parameter name : folds value :  5
parameter name : seed value :  1
parameter name : metric value :  auc
parameter name : has_head value :  true
[0, 29943]
Exception in thread "main" java.lang.IllegalStateException: File train.vm2.csv  failed to import at bufferreader
        at io.input.readsmatrixdata(input.java:1327)
        at stacknetrun.runstacknet.main(runstacknet.java:425)

Something seems wrong with the dimensions printed above. I also checked my CSV files, which are comma-delimited as StackNet suggests (the true dimensions are 29943 rows x 319 columns for train and 318 columns for test).

kaz-Anova commented 7 years ago

@arisbw if your files are not in sparse (svmlight) format, you need to set sparse=false.

So if your file looks like target,feat1,feat2,feat3,...,feat318, you need sparse=false. If your file is in this format (where zero values are omitted): target index1:feat1 index5:feat5 index17:feat17 ... indexk:featk (with k<=n), you need sparse=true.
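As a rough illustration of the distinction above, here is a minimal Python sketch that looks at one data line and guesses which value of sparse= fits. It is only a heuristic and not StackNet's own parser; the function names `guess_sparse_flag` and `guess_from_file` and the `skip_header` parameter are hypothetical and made up for this example.

```python
import re

# Matches one svmlight-style token such as "5:0.25" (index:value).
_SPARSE_TOKEN = re.compile(r"^\d+:[^\s:]+$")


def guess_sparse_flag(line: str) -> str:
    """Guess 'true' or 'false' for the sparse= option from one data line.

    Heuristic only: a line counts as sparse when every whitespace-separated
    token after the first (the target) has the form index:value.
    """
    tokens = line.strip().split()
    if len(tokens) > 1 and all(_SPARSE_TOKEN.match(t) for t in tokens[1:]):
        return "true"
    return "false"


def guess_from_file(path: str, skip_header: bool = False) -> str:
    """Apply the guess to the first data line of a file (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        if skip_header:
            next(handle, None)  # discard the header row, if there is one
        first_data_line = next(handle, "")
    return guess_sparse_flag(first_data_line)


if __name__ == "__main__":
    print(guess_sparse_flag("1,0.5,0,3.2"))         # dense, comma-delimited row
    print(guess_sparse_flag("1 1:0.5 5:2 17:3.2"))  # svmlight-style row
```

For a comma-delimited file with a header, `guess_from_file("train.vm2.csv", skip_header=True)` would suggest sparse=false. A sparse row that contains only a target and no index:value pairs is also reported as "false", so it is worth checking more than one line by hand.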

arisbw commented 7 years ago

Ah, I forgot that! 😞 Thank you @kaz-Anova

sotiristsak commented 6 years ago

Hello. It seems that I'm having the same issue. I run: java -jar StackNet.jar train, task=classification model=model pred_file=predictions.csv has_head=true train_file=train_new.csv test_file=test_new.csv params=params.txt sparse=false

and I get:

parameter name : task value : classification
parameter name : model value : model
parameter name : pred_file value : predictions.csv
parameter name : has_head value : true
parameter name : train_file value : train_new.csv
parameter name : test_file value : test_new.csv
parameter name : params value : params.txt
parameter name : sparse value : false
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.IllegalStateException: File train_new.csv failed to import at bufferreader
        at io.input.Readfmatrix(input.java:845)
        at stacknetrun.runstacknet.main(runstacknet.java:445)

The head of train_new.csv looks like this:

target feat1 feat2 feat3 feat4 feat5 feat6 feat7 feat8 feat9 feat10 feat11 feat12 feat13 feat14 feat15
0 83230 3 1 13 379 14 6 1 1 1 NaN NaN NaN 14
0 17357 3 1 19 379 14 6 1 1 1 NaN NaN NaN 14
0 35810 3 1 13 379 14 6 1 1 1 NaN NaN NaN 14
0 45745 14 1 13 478 14 6 1 1 1 NaN NaN NaN 14
0 161007 3 1 13 379 14 6 1 1 1 NaN NaN NaN 14