RMLio / rmlmapper-java

The RMLMapper executes RML rules to generate high quality Linked Data from multiple originally (semi-)structured data sources
http://rml.io
MIT License

Why restricting the DatabaseType ? (add Access in the list of database types) #118

Open tfrancart opened 3 years ago

tfrancart commented 3 years ago

I need to map an Access database (sigh...). I can access it through a JDBC driver (http://ucanaccess.sourceforge.net/site.html).

Why does rmlmapper restrict database connections to the limited subset of databases listed in DatabaseType? Isn't a JDBC connection to any database sufficient?

Would you consider adding Access as a possible DatabaseType?

Thanks

bjdmeest commented 3 years ago

Thanks for the suggestion!

The reason we limit the supported databases via DatabaseType in the mapper is twofold:

  1. we use D2RQ to describe connections to SQL databases, which isn't 100% complete in describing, e.g., how to actually form the connection string, and there are some differences between JDBC drivers, which we catch in the RMLMapper (see https://github.com/RMLio/rmlmapper-java/blob/master/src/main/java/be/ugent/rml/access/RDBAccess.java#L64 and following code)
  2. (but this is more of a practical issue) we try to have full test coverage for the data sources we support, and creating the test cases for Access just wasn't one of our priorities (the RMLMapper is funded via research projects, so we need to prioritize based on that)

In practice, it's probably a matter of (as you mentioned) adding Access to https://github.com/RMLio/rmlmapper-java/blob/master/src/main/java/be/ugent/rml/access/DatabaseType.java , and maybe fixing some connection string generation inconsistencies in https://github.com/RMLio/rmlmapper-java/blob/master/src/main/java/be/ugent/rml/access/RDBAccess.java to get it working. We'll put it on our roadmap, but it's very hard to make promises on timing. However, if you are comfortable trying to make the changes yourself, we would be very happy with a pull request! :D

tfrancart commented 3 years ago

Hi, thanks for the answer. For other reasons I'll try to convert the Access DB to PostgreSQL, and I will probably start from there, so this is probably not a high priority for me now. Thanks for the pointer to the code; if I really need it, I will try to add a new DatabaseType myself.

sixdiamants commented 1 year ago

I implemented the suggested modifications and got MS Access to work.
Here are the steps.
Added the database type to DatabaseType.java:

MSACCESS("MS Access",
            "msaccess:",
            "msaccess",
            "jdbc:ucanaccess");

Added the dependency on UcanAccess to the pom.xml:

<dependency>
  <groupId>net.sf.ucanaccess</groupId>
  <artifactId>ucanaccess</artifactId>
  <version>5.0.1</version>
</dependency>

Added the code below to RDBAccess.java:

if (databaseType == DatabaseType.MSACCESS) {
  dsn = "jdbc:ucanaccess:"+dsn;
}

This prepends the UCanAccess JDBC prefix to the DSN.

Recompiled the jars.

The RMLMapper now accepts a mapping.ttl with a database description like

<#DB_source> a d2rq:Database;
    d2rq:jdbcDSN "//C:\Users\sixdiamants\workspace\path\to\test.MDB"; 
    d2rq:jdbcDriver "msaccess"; # this is used to detect the ucanaccess driver
    d2rq:username "";
    d2rq:password "" .
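
For readers who want to see how the pieces above fit together, here is a minimal, self-contained Java sketch of the two steps: detecting the database type from the d2rq:jdbcDriver value and prepending the UCanAccess prefix to the DSN. It is not the RMLMapper's actual code. The class AccessDsnSketch, the enum DbType, and the methods fromDriver and toJdbcUrl are hypothetical names invented for illustration, and the check that avoids prepending the prefix twice is an addition that is not in the snippet above. The example path is also made up and uses forward slashes.

```java
import java.util.Locale;
import java.util.Optional;

public class AccessDsnSketch {

    // Hypothetical stand-in for the mapper's DatabaseType enum,
    // reduced to the two fields this sketch needs.
    enum DbType {
        POSTGRES("postgres", "jdbc:postgresql:"),
        MSACCESS("msaccess", "jdbc:ucanaccess:");

        final String driverKeyword; // substring looked for in d2rq:jdbcDriver
        final String jdbcPrefix;    // prefix of the final JDBC URL

        DbType(String driverKeyword, String jdbcPrefix) {
            this.driverKeyword = driverKeyword;
            this.jdbcPrefix = jdbcPrefix;
        }

        // Returns the first type whose keyword occurs in the driver string.
        static Optional<DbType> fromDriver(String driver) {
            String lower = driver.toLowerCase(Locale.ROOT);
            for (DbType type : values()) {
                if (lower.contains(type.driverKeyword)) {
                    return Optional.of(type);
                }
            }
            return Optional.empty();
        }
    }

    // Prepends the JDBC prefix unless the DSN already carries it
    // (the guard is an assumption, not part of the original change).
    static String toJdbcUrl(DbType type, String dsn) {
        return dsn.startsWith(type.jdbcPrefix) ? dsn : type.jdbcPrefix + dsn;
    }

    public static void main(String[] args) {
        DbType type = DbType.fromDriver("msaccess")
                .orElseThrow(() -> new IllegalArgumentException("unsupported driver"));
        // Prints the URL that would be handed to the JDBC driver.
        System.out.println(toJdbcUrl(type, "//C:/path/to/test.MDB"));
    }
}
```

With the mapping above, the driver value "msaccess" selects the Access entry, and the DSN is turned into a URL of the form jdbc:ucanaccess://<path to the .mdb file>, which is the format UCanAccess expects.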