Open GoogleCodeExporter opened 9 years ago
Another question is how we will store the information about a dataset. We have a class called dataset. Its attributes are:
name, hash, path, continuous, nominal, missing.
The boolean attributes add information about the dataset. But is there some way to get this information from the Weka command line?
Which other attributes do we need?
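For illustration, here is a minimal Python sketch of such a dataset record, filled in by scanning an ARFF file as plain text. It does not use Weka at all. The function name describe_arff, the choice of SHA-256 for the hash, and the reading of continuous/nominal/missing as "the file has at least one numeric attribute / nominal attribute / missing value" are all assumptions, not something settled in this thread. The scan is crude: it ignores sparse ARFF data and quoted commas.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path

# Attribute types treated as "continuous" in this sketch (assumption).
NUMERIC_TYPES = {"numeric", "real", "integer"}


@dataclass
class Dataset:
    name: str
    hash: str          # hex digest of the file contents (SHA-256 assumed)
    path: str
    continuous: bool   # at least one numeric attribute
    nominal: bool      # at least one nominal attribute, e.g. {yes, no}
    missing: bool      # at least one '?' value in the data section


def describe_arff(path):
    """Build a Dataset record from a plain-text scan of an ARFF file."""
    raw = Path(path).read_bytes()
    name = Path(path).stem  # fallback if no @relation line is found
    continuous = nominal = missing = False
    in_data = False

    for line in raw.decode("utf-8", errors="replace").splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            # Crude check: a bare '?' field marks a missing value.
            if "?" in [field.strip() for field in line.split(",")]:
                missing = True
        elif lower.startswith("@relation"):
            parts = line.split(None, 1)
            if len(parts) == 2:
                name = parts[1].strip("'\"")
        elif lower.startswith("@attribute"):
            if line.endswith("}"):
                nominal = True
            elif lower.split()[-1] in NUMERIC_TYPES:
                continuous = True
        elif lower.startswith("@data"):
            in_data = True

    return Dataset(
        name=name,
        hash=hashlib.sha256(raw).hexdigest(),
        path=str(path),
        continuous=continuous,
        nominal=nominal,
        missing=missing,
    )
```

Usage would be something like describe_arff("dataset.arff"), which returns a Dataset whose three boolean fields can be saved together with the name, hash and path.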
Original comment by illoqpa...@gmail.com
on 4 Jun 2010 at 10:47
"Weka knows the qualities but it uses the API to know it. How will we catch this exception?"
Well, we can expect most client datasets to be valid ARFF/CSV. But obviously that can't be trusted, so we need a validator so that a GoGrid server doesn't get started in vain. In my old wonline code, a dataset was 'parsed' after uploading to check that it contained certain elements (@data, @attribute etc.), but the quicker and more reliable way is to use the Weka API, e.g. to run the "java weka.core.Instances dataset.arff" command-line interface and catch its exception: only if the dataset is valid does it print out a summary of the dataset (number of instances etc.). I don't know what it throws if the dataset is invalid, but the output will be different.
The format requirement and this invoicing policy "you have to pay even if your
dataset is invalid" should make them format datasets properly.
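The two validation approaches described above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names are made up, weka.jar is assumed to be on the CLASSPATH, and since the exact failure behaviour of the Weka command is not known here, the check only relies on the statement that a summary is printed when the dataset is valid.

```python
import subprocess

# Keywords that the old 'parse' style check would look for in an ARFF file.
REQUIRED_KEYWORDS = ("@relation", "@attribute", "@data")


def looks_like_arff(text):
    """Cheap pre-check: every required keyword starts at least one line."""
    lines = [line.strip().lower() for line in text.splitlines()]
    return all(
        any(line.startswith(keyword) for line in lines)
        for keyword in REQUIRED_KEYWORDS
    )


def weka_accepts(path, run=subprocess.run):
    """Run 'java weka.core.Instances <path>' and report whether it printed
    a summary.

    Assumption: a valid dataset gives exit status 0 and some output on
    stdout; anything else is treated as invalid. The 'run' parameter exists
    only so the command runner can be replaced in tests.
    """
    result = run(
        ["java", "weka.core.Instances", path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and bool(result.stdout.strip())
```

The idea would be to call looks_like_arff first, because it costs almost nothing, and to call weka_accepts only for files that pass it, before any GoGrid server is started.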
Original comment by harri.sa...@gmail.com
on 4 Jun 2010 at 1:33
Original issue reported on code.google.com by
illoqpa...@gmail.com
on 4 Jun 2010 at 10:34