Closed giuseppec closed 8 years ago
PING.
Related issue: e.g. http://www.openml.org/api_splits/get/7529/Task_7529_splits.arff is not valid ARFF (see also https://github.com/openml/website/issues/25)
The example seems to be a valid list of nominal values in the arff format according to the developer description of the file format.
However, only the farff package fails here. RWeka::read.arff is able to read this arff file.
farff now seems to be able to read those data sets.
Is it reading them or just skipping them?
It is reading them. For both farff and RWeka, the R data.frame is equivalent (except for did = 73, where I get a java OutOfMemoryError when using RWeka, but this is not our problem and farff is still able to read it).
Awesome :)
The ARFF files with ids 70, 71, 73 (maybe there are some more) seem wrong. Here is the direct link to one of these datasets: http://www.openml.org/data/download/1716/BayesianNetworkGenerator_anneal_small.arff . In the header there is one line which causes an error with all ARFF readers that are available in R. The suspicious line is:
@attribute carbon {'\'B1of3\'','\'B2of3\'','\'B3of3\''}
Is this valid ARFF format?
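To make the question concrete, here is a minimal Python sketch of how the nominal value list on that line would tokenize, assuming that a backslash inside a single-quoted ARFF string escapes the next character. The function parse_nominal_values is a hypothetical helper written for illustration only; it is not part of farff, RWeka, or any other ARFF reader, and it does not cover the whole format.

```python
import re

# Hypothetical helper (not from farff or RWeka): split an ARFF nominal
# specification such as {a,'b c','\'d\''} into its individual values.
# Assumption: inside single quotes, a backslash escapes the next character.
_TOKEN = re.compile(r"""
    \s*
    (?:
        '((?:[^'\\]|\\.)*)'     # single-quoted value, backslash escapes allowed
      | ([^,{}'\s][^,{}]*?)     # bare (unquoted) value
    )
    \s*(?:,|$)                  # separator or end of the list
""", re.VERBOSE)


def parse_nominal_values(spec):
    """Return the list of values in a brace-enclosed nominal specification."""
    body = spec.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("nominal specification must be enclosed in braces")
    body = body[1:-1]

    values, pos = [], 0
    while pos < len(body):
        m = _TOKEN.match(body, pos)
        if m is None:
            raise ValueError(f"cannot parse value at offset {pos}")
        quoted, bare = m.group(1), m.group(2)
        if quoted is not None:
            # Drop the escaping backslash, keep the escaped character.
            values.append(re.sub(r"\\(.)", r"\1", quoted))
        else:
            values.append(bare.strip())
        pos = m.end()
    return values


if __name__ == "__main__":
    line = r"{'\'B1of3\'','\'B2of3\'','\'B3of3\''}"
    for value in parse_nominal_values(line):
        print(value)
```

Under that escaping assumption the line yields three levels, each of which keeps its literal single quotes ('B1of3', 'B2of3', 'B3of3'). A reader that does not honor backslash escapes would instead see the string end right after the first backslash and fail on the rest of the line.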