usc-isi-i2 / dig-etl-engine

Download DIG to run on your laptop or server.
http://usc-isi-i2.github.io/dig/
MIT License

Classify tables #86

Open szeke opened 6 years ago

szeke commented 6 years ago

Develop the table classifier. It should load the sample vectors as a resource once, at pipeline startup, and then tag each table with a cluster id.

Table classification should run only when the vectors resource is available; if no vectors are available, tables are left unclassified.
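
One way this could look, as a minimal sketch rather than the actual dig-etl-engine implementation. Everything named here is hypothetical: the `TableClassifier` class, the `vector` and `cluster_id` keys on a table, and the assumption that the resource is a JSON file mapping each cluster id to one sample vector. The issue does not say how a cluster id is chosen, so the sketch assumes nearest sample vector by cosine similarity.

```python
import json
import math
import os
from typing import Dict, List, Optional


def load_sample_vectors(path: str) -> Optional[Dict[str, List[float]]]:
    """Load the sample vectors once; return None if the resource is absent."""
    if not path or not os.path.isfile(path):
        return None
    with open(path, "r", encoding="utf-8") as f:
        # Assumed layout: {"<cluster id>": [<float>, ...], ...}
        return json.load(f)


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TableClassifier:
    """Hypothetical pipeline step that tags tables with a cluster id."""

    def __init__(self, vectors_path: str) -> None:
        # Read the resource at startup, not once per table.
        self.vectors = load_sample_vectors(vectors_path)

    @property
    def enabled(self) -> bool:
        """True only when a non-empty vectors resource was loaded."""
        return bool(self.vectors)

    def classify(self, table: dict) -> dict:
        """Add 'cluster_id' to the table, or return it untouched if disabled."""
        if not self.enabled:
            return table
        vector = table.get("vector")  # assumed key holding the table's vector
        if vector is None:
            return table
        table["cluster_id"] = max(
            self.vectors,
            key=lambda cid: cosine_similarity(vector, self.vectors[cid]),
        )
        return table
```

With this shape, a pipeline built without a vectors file still runs: `TableClassifier("missing.json").classify(table)` returns the table without a `cluster_id` field.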