NCEAS / z-test-issues

Test issue imports from redmine

Data Manager Library: Create an EML parser lib to digest EML documents #405

Closed mbjones closed 7 years ago

mbjones commented 7 years ago

Author Name: Jing Tao (Jing Tao) Original Redmine Issue: 2507, https://projects.ecoinformatics.org/ecoinfo/issues/2507 Original Date: 2006-08-01 Original Assignee: Jing Tao


Currently, the EML actor in Kepler can download an EML document and parse it. After parsing, the entity information in the EML document is stored in Java objects, and the data file is downloaded to the local file system and also stored in a relational database. We want to separate this process from Kepler and make it a library in the eml module, so that the library can be used in Kepler, Metacat, and other projects.

mbjones commented 7 years ago

Original Redmine Comment Author Name: Jing Tao (Jing Tao) Original Date: 2006-08-01T18:12:14Z


Here is our plan: create 3 packages in the eml src dir:

  1. org.ecoinformatics.eml.digestor package, whose main class is EML200Parser. The main class can be copied from the kepler module.

  2. org.ecoinformatics.eml.download package, whose main class is DataDistributionHandler. This class will implement the Runnable interface, and its API is: DataDistributionHandler(Entity entity); run();

  3. org.ecoinformatics.eml.db package, whose main class is TableGenerator. The API of the class is: TableGenerator(Entity entity, File localFile); generateTable(); getTableName(); loadDataToTable();

The function of the download package is very similar to the cache system of Kepler. I am thinking about how to reuse that code from Kepler.

mbjones commented 7 years ago

Original Redmine Comment Author Name: Jing Tao (Jing Tao) Original Date: 2006-08-02T18:04:45Z


To make the download package more configurable, I would like to change the constructor to: DataDistributionHandler(Entity entity, File cacheDir, File fileName);
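A minimal sketch of how the three-argument constructor could look, assuming the handler simply copies the entity's data into the cache directory. The Entity class here is a hypothetical stand-in with only a name and a distribution URL, getLocalFile() is a helper that is not part of the proposed API, and treating fileName as a name resolved inside cacheDir is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

/** Hypothetical stand-in for the parsed entity; only what this sketch needs. */
class Entity {
    private final String name;
    private final String distributionUrl;

    Entity(String name, String distributionUrl) {
        this.name = name;
        this.distributionUrl = distributionUrl;
    }

    String getName() { return name; }
    String getDistributionUrl() { return distributionUrl; }
}

/** Downloads one entity's data file into a caller-chosen cache directory. */
class DataDistributionHandler implements Runnable {
    private final Entity entity;
    private final File cacheDir;
    private final File fileName;

    DataDistributionHandler(Entity entity, File cacheDir, File fileName) {
        this.entity = entity;
        this.cacheDir = cacheDir;
        this.fileName = fileName;
    }

    /** Assumption: fileName is resolved relative to cacheDir. */
    File getLocalFile() {
        return new File(cacheDir, fileName.getName());
    }

    @Override
    public void run() {
        try {
            // Make sure the cache directory exists before writing into it.
            Files.createDirectories(cacheDir.toPath());
            URI source = URI.create(entity.getDistributionUrl());
            try (InputStream in = source.toURL().openStream()) {
                Files.copy(in, getLocalFile().toPath(),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            // run() cannot throw checked exceptions, so wrap them.
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the class implements Runnable, a caller could run it directly or hand it to a Thread or executor, one handler per entity.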

mbjones commented 7 years ago

Original Redmine Comment Author Name: Jing Tao (Jing Tao) Original Date: 2006-08-04T23:49:20Z


Here is the change in the org.ecoinformatics.eml.db package. The main class is SQLCommandHandler, and its API is: SQLCommandHandler(DBConnection conn, String plugInName); generateTable(Entity entity, File fileName), which returns the generated table name as a string; dropTable(String tableName); executeSelectionSQLCommand(String sqlCommand), which returns a ResultSet object.
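The SQLCommandHandler API listed above could be sketched roughly as follows. This is only an illustration under several assumptions: java.sql.Connection is swapped in for the DBConnection type, the query method is spelled out as executeSelectionSQLCommand, Entity is a hypothetical class carrying a name and column names, every column is typed VARCHAR(255), and the helpers toSqlName and buildCreateTable are invented for the example. Loading rows from the data file is left out.

```java
import java.io.File;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;
import java.util.Locale;
import java.util.StringJoiner;

/** Hypothetical stand-in for the parsed entity: a name plus column names. */
class Entity {
    private final String name;
    private final List<String> columnNames;

    Entity(String name, List<String> columnNames) {
        this.name = name;
        this.columnNames = columnNames;
    }

    String getName() { return name; }
    List<String> getColumnNames() { return columnNames; }
}

class SQLCommandHandler {
    private final Connection conn;
    // Assumption: plugInName would select database-specific behavior; unused here.
    private final String plugInName;

    SQLCommandHandler(Connection conn, String plugInName) {
        this.conn = conn;
        this.plugInName = plugInName;
    }

    /** Hypothetical helper: turn an arbitrary name into a plain SQL identifier. */
    static String toSqlName(String raw) {
        String name = raw.trim().toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9_]", "_");
        if (name.isEmpty() || Character.isDigit(name.charAt(0))) {
            name = "t_" + name;
        }
        return name;
    }

    /** Hypothetical helper: build the CREATE TABLE statement for an entity. */
    static String buildCreateTable(Entity entity) {
        StringJoiner cols = new StringJoiner(", ", "(", ")");
        for (String column : entity.getColumnNames()) {
            cols.add(toSqlName(column) + " VARCHAR(255)");
        }
        return "CREATE TABLE " + toSqlName(entity.getName()) + " " + cols;
    }

    /** Creates the table and returns its name; loading fileName's rows is omitted. */
    String generateTable(Entity entity, File fileName) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(buildCreateTable(entity));
        }
        return toSqlName(entity.getName());
    }

    void dropTable(String tableName) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.executeUpdate("DROP TABLE " + toSqlName(tableName));
        }
    }

    /** The caller is responsible for closing the ResultSet and its Statement. */
    ResultSet executeSelectionSQLCommand(String sqlCommand) throws SQLException {
        return conn.createStatement().executeQuery(sqlCommand);
    }
}
```

Keeping SQL generation in static helpers separates it from the connection, so the generated statements can be inspected without a running database.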

The org.ecoinformatics.eml.digestor package API is: EML200Parser(InputStream stream); EML200Parser(InputSource source); getEntityList(), which returns a Vector; parse();
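As a rough illustration of the parser API above, the sketch below reads an EML document with the JDK's DOM parser and collects one entity per dataTable element. It is not the Kepler implementation: Entity is a hypothetical class holding only the entityName text, and a real parser would extract far more (attributes, distribution URLs, other entity types).

```java
import java.io.InputStream;
import java.util.Vector;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical stand-in for the parsed entity; only the name is kept. */
class Entity {
    private final String name;

    Entity(String name) { this.name = name; }

    String getName() { return name; }
}

class EML200Parser {
    private final InputSource source;
    private final Vector<Entity> entities = new Vector<>();

    EML200Parser(InputStream stream) {
        this(new InputSource(stream));
    }

    EML200Parser(InputSource source) {
        this.source = source;
    }

    /** Parses the document once and records every dataTable's entityName. */
    void parse() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source);
        NodeList tables = doc.getElementsByTagName("dataTable");
        for (int i = 0; i < tables.getLength(); i++) {
            Element table = (Element) tables.item(i);
            NodeList names = table.getElementsByTagName("entityName");
            if (names.getLength() > 0) {
                entities.add(new Entity(names.item(0).getTextContent().trim()));
            }
        }
    }

    /** Returns the entities found by parse(); empty until parse() is called. */
    Vector<Entity> getEntityList() {
        return entities;
    }
}
```

A caller would construct the parser from a stream, call parse(), and then iterate over getEntityList() to hand each entity to the download and database steps.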

mbjones commented 7 years ago

Original Redmine Comment Author Name: Jing Tao (Jing Tao) Original Date: 2006-08-08T20:35:45Z


New package names are suggested:

org.ecoinformatics.eml.digestor.parser
org.ecoinformatics.eml.digestor.download
org.ecoinformatics.eml.digestor.db

mbjones commented 7 years ago

Original Redmine Comment Author Name: Matt Jones (Matt Jones) Original Date: 2006-08-08T20:58:54Z


digestor is a bit of a crude name. How about "loader"?

org.ecoinformatics.eml.loader.parser
org.ecoinformatics.eml.loader.download
org.ecoinformatics.eml.loader.database

This is an improvement but still not totally great. Suggestions welcome.

mbjones commented 7 years ago

Original Redmine Comment Author Name: James Brunt (James Brunt) Original Date: 2006-08-08T21:15:26Z


I like loader, but it's not a perfect fit for the way the work is divided, which is more like parse (eml) -> create (table) -> source (data). Correct?

mbjones commented 7 years ago

Original Redmine Comment Author Name: Duane Costa (Duane Costa) Original Date: 2006-10-26T16:45:11Z


We named the top-level package "org.ecoinformatics.datamanager". The complete set of packages is:

org.ecoinformatics.datamanager
org.ecoinformatics.datamanager.database
org.ecoinformatics.datamanager.download
org.ecoinformatics.datamanager.parser
org.ecoinformatics.datamanager.parser.eml

mbjones commented 7 years ago

Original Redmine Comment Author Name: Duane Costa (Duane Costa) Original Date: 2006-10-26T16:58:15Z


Bug 2504 has been marked as a duplicate of this bug.

mbjones commented 7 years ago

Original Redmine Comment Author Name: ben leinfelder (ben leinfelder) Original Date: 2010-01-11T19:50:02Z


This has been completed. Moreover, it has been extended to support any XML schema that makes use of the EML dataSet module.

mbjones commented 7 years ago

Original Redmine Comment Author Name: Redmine Admin (Redmine Admin) Original Date: 2013-03-27T21:20:25Z


Original Bugzilla ID was 2507