Waikato / moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
http://moa.cms.waikato.ac.nz/
GNU General Public License v3.0

Date attribute error #248

Open mik3hall opened 2 years ago

mik3hall commented 2 years ago

I'm getting an error on a date attribute in an ARFF file.

@relation R_data_frame

@attribute timestamp Date "dd MMMMM yyyy HH:mm:ss"

I believe this is the same format as shown in the ARFF specification.

Some data...

"2017-12-31 18:01:00",2,40.0,2376.58,2399.5,2357.14,2374.59,19.23300519,2373.1163915061647,-0.004218152387429286
"2017-12-31 18:01:00",0,5.0,8.53,8.53,8.53,8.53,78.38,8.53,-0.014398966468964769
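Separately from how the loader behaves, the declared pattern and the sample rows look like two different layouts: the pattern starts with the day and a spelled-out month, while the rows start with a four-digit year and use dashes. A minimal sketch for checking a pattern against one value with `java.text.SimpleDateFormat` follows. The class and method names (`DatePatternCheck`, `matches`) are made up for illustration and are not part of MOA, and the second pattern, `yyyy-MM-dd HH:mm:ss`, is only my guess at what would fit the data.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class DatePatternCheck {
    // Hypothetical helper (not a MOA API): true if `value` parses
    // strictly under the given SimpleDateFormat pattern.
    public static boolean matches(String pattern, String value) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ENGLISH);
        fmt.setLenient(false); // reject out-of-range fields instead of rolling them over
        try {
            fmt.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String sample = "2017-12-31 18:01:00"; // value taken from the data rows
        // Pattern as declared in the @attribute line
        System.out.println(matches("dd MMMMM yyyy HH:mm:ss", sample));
        // Assumed alternative that mirrors the layout of the data
        System.out.println(matches("yyyy-MM-dd HH:mm:ss", sample));
    }
}
```

If the first check fails and the second succeeds, the header pattern would need changing in any case, independent of whether the loader supports date attributes at all.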

The error

Failure reason: For input string: "2017-12-31 18:01:00"
STACK TRACE
java.lang.NumberFormatException: For input string: "2017-12-31 18:01:00"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
    at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
    at java.lang.Double.parseDouble(Double.java:538)
    at java.lang.Double.valueOf(Double.java:502)
    at com.yahoo.labs.samoa.instances.ArffLoader.readInstanceDense(ArffLoader.java:163)
    at com.yahoo.labs.samoa.instances.ArffLoader.readInstance(ArffLoader.java:130)
    at com.yahoo.labs.samoa.instances.Instances.readInstance(Instances.java:477)
    at moa.streams.ArffFileStream.readNextInstanceFromFile(ArffFileStream.java:157)
    at moa.streams.ArffFileStream.restart(ArffFileStream.java:148)
    at moa.streams.ArffFileStream.prepareForUseImpl(ArffFileStream.java:94)
    at moa.options.AbstractOptionHandler.prepareForUse(AbstractOptionHandler.java:86)
    at moa.options.OptionsHandler.prepareClassOptions(OptionsHandler.java:149)
    at moa.options.AbstractOptionHandler.prepareClassOptions(AbstractOptionHandler.java:151)
    at moa.tasks.AbstractTask.doTask(AbstractTask.java:52)
    at moa.tasks.TaskThread.run(TaskThread.java:76)

I've tried it with and without double quotes around the date format. Looking at the source, it appears that no special date handling is done and the loader simply assumes a numeric value.
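A possible workaround, sketched under assumptions rather than taken from MOA itself: the stack trace shows the value going through `Double.valueOf`, so the date column could be converted to a number (for example epoch milliseconds) before the file is loaded, and the attribute declared as numeric. The helper name `toEpochMillis`, the pattern `yyyy-MM-dd HH:mm:ss`, and the choice of UTC are all assumptions for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class DateToNumeric {
    // Hypothetical preprocessing helper (not a MOA API): strips the quotes
    // from a date field and returns epoch milliseconds, assuming UTC.
    public static double toEpochMillis(String field, String pattern) throws ParseException {
        String unquoted = field.replace("\"", "").trim();
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ENGLISH);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: timestamps are UTC
        fmt.setLenient(false);
        return fmt.parse(unquoted).getTime();
    }

    public static void main(String[] args) throws ParseException {
        // Same call that appears in the stack trace, applied to the date text
        try {
            Double.valueOf("2017-12-31 18:01:00");
        } catch (NumberFormatException e) {
            System.out.println("Double.valueOf fails: " + e.getMessage());
        }

        // Converting first yields a value a numeric attribute can hold
        String field = "\"2017-12-31 18:01:00\"";
        System.out.println(toEpochMillis(field, "yyyy-MM-dd HH:mm:ss"));
    }
}
```

This loses the date semantics on the MOA side, since the column becomes an ordinary number, but it would at least let the rest of the rows load while date attributes are unsupported.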