ysminnpu / wekaonline

Automatically exported from code.google.com/p/wekaonline

cross-OS issues -> filter subsystem solution #4

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The plan is for wronline to serve both the rapidminer and weka communities.
This means making wronline compatible with both tools.

The filter subsystem, which Fernando outlined as having SQL-like qualities,
can provide a way to do this. It would then go well beyond the standard
rapidminer and weka distributions, which do have SQL I/O but require the
client to write the actual SQL queries (which is awkward).

I have no answers or formal specification yet, but the idea of "enabling
the client to use anything in rapidminer or weka" is good enough that it
should be developed further.

(Obviously, since rapidminer contains weka, we could get by with much less
compatibility work if we focused on rapidminer -> rapidonline, even though
the current emphasis is to deploy "weka online" first.)

Original issue reported on code.google.com by harri.sa...@gmail.com on 4 Jun 2010 at 1:49

GoogleCodeExporter commented 9 years ago
My idea is to use Weka to do the filtering. For example, say we have a file
called cars.csv. When the user applies a filter, we run the corresponding
Weka filter.

First, we have to convert the file to ARFF:

java weka.core.converters.CSVLoader cars.csv > cars.arff

Next, we apply the chosen filter to the file:

java weka.filters.unsupervised.instance.RemoveMisclassified -i cars.arff -o cars_filtered.arff

The last step is to save a new File instance (cars_filtered.arff) in the
database, linked to the user who applied the filter.
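
The convert-then-filter steps above could be assembled on the server roughly as follows. This is only a sketch: the class WekaFilterPipeline and its method names are hypothetical, it assumes weka.jar is on the classpath of the spawned java process, and it relies on Weka filters taking their input and output files through the -i and -o options. The database step is left out.

```java
import java.util.List;

/** Hypothetical helper: builds the Weka command lines for the two steps. */
final class WekaFilterPipeline {

    /** "cars.csv" -> "cars.arff" */
    static String arffNameFor(String csvName) {
        return stripExtension(csvName) + ".arff";
    }

    /** "cars.arff" -> "cars_filtered.arff" */
    static String filteredNameFor(String arffName) {
        return stripExtension(arffName) + "_filtered.arff";
    }

    /** CSVLoader prints ARFF to standard output, so the caller must redirect it. */
    static List<String> convertCommand(String csvName) {
        return List.of("java", "weka.core.converters.CSVLoader", csvName);
    }

    /** Weka filters read their input from -i and write their output to -o. */
    static List<String> filterCommand(String filterClass, String inputArff, String outputArff) {
        return List.of("java", filterClass, "-i", inputArff, "-o", outputArff);
    }

    private static String stripExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot <= 0 ? fileName : fileName.substring(0, dot);
    }

    public static void main(String[] args) {
        String arff = arffNameFor("cars.csv");
        String filtered = filteredNameFor(arff);
        // Only prints the commands; nothing is executed here.
        System.out.println(String.join(" ", convertCommand("cars.csv")) + " > " + arff);
        System.out.println(String.join(" ", filterCommand(
                "weka.filters.unsupervised.instance.RemoveMisclassified", arff, filtered)));
    }
}
```

Each list can be handed to java.lang.ProcessBuilder. For the conversion step, redirectOutput(new File(arff)) would take the place of the shell's ">".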

Original comment by illoqpa...@gmail.com on 7 Jun 2010 at 6:09

GoogleCodeExporter commented 9 years ago
sure, all the filters available in rapidminer+weka would naturally be supported

my point with "cross OS" is to have wronline deliver ANY kind of filter the
client wants

a GUI can be created where the client specifies the filter operations,
and then a 'translator' turns that filter into a generic SQL query,
which is applied to the stored client dataset; the result of the query is
stored and made available to the client
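
One way the 'translator' idea could look, as a minimal sketch only: the FilterTranslator class, its Condition and Query types, and the restriction to simple comparison operators are all assumptions made for illustration, not an agreed design. It maps a list of conditions, as a GUI might collect them, to a parameterized SELECT, so that values are bound as parameters rather than pasted into the query text.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: turns GUI filter conditions into a parameterized SQL query. */
final class FilterTranslator {

    /** One condition picked in the GUI, e.g. column "price", operator "<", value 20000. */
    static final class Condition {
        final String column;
        final String operator;
        final Object value;

        Condition(String column, String operator, Object value) {
            this.column = column;
            this.operator = operator;
            this.value = value;
        }
    }

    /** SQL text plus the values to bind to its '?' placeholders, in order. */
    static final class Query {
        final String sql;
        final List<Object> parameters;

        Query(String sql, List<Object> parameters) {
            this.sql = sql;
            this.parameters = parameters;
        }
    }

    private static final List<String> ALLOWED_OPERATORS =
            List.of("=", "<>", "<", "<=", ">", ">=");

    static Query toSql(String table, List<Condition> conditions) {
        requireIdentifier(table);
        List<String> clauses = new ArrayList<>();
        List<Object> parameters = new ArrayList<>();
        for (Condition c : conditions) {
            requireIdentifier(c.column);
            if (!ALLOWED_OPERATORS.contains(c.operator)) {
                throw new IllegalArgumentException("Unsupported operator: " + c.operator);
            }
            // The value is never concatenated into the SQL text.
            clauses.add(c.column + " " + c.operator + " ?");
            parameters.add(c.value);
        }
        String sql = "SELECT * FROM " + table;
        if (!clauses.isEmpty()) {
            sql += " WHERE " + String.join(" AND ", clauses);
        }
        return new Query(sql, parameters);
    }

    /** Table and column names cannot be bound as parameters, so they are validated. */
    private static void requireIdentifier(String name) {
        if (!name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Invalid identifier: " + name);
        }
    }

    public static void main(String[] args) {
        Query q = toSql("cars", List.of(
                new Condition("price", "<", 20000),
                new Condition("fuel", "=", "diesel")));
        System.out.println(q.sql);        // SELECT * FROM cars WHERE price < ? AND fuel = ?
        System.out.println(q.parameters); // [20000, diesel]
    }
}
```

The resulting SQL and parameter list could then be run through a JDBC PreparedStatement against the stored client dataset.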

this greatly accelerates the development cycle of both rapidminer and weka and
extends the catalogue of available methods / operators

Original comment by harri.sa...@gmail.com on 7 Jun 2010 at 9:08