While it is bad practice, the reality is that many tabular datasets are still published as Excel files. Often this is because the data spans multiple sheets and the publishers use Excel to curate those documents.
Some years ago I was using XLwrap. I can do what I need in RML, but I liked being able to work directly on Excel files. It would be nice if RMLProcessor could do this as well.
I also liked that I could address cells and sheets directly. Using keys is smarter, but I have several files whose columns do not contain unique keys, so for those, addressing a cell such as B4 would be useful. Note that sheet support would be pretty much mandatory for this feature to be useful.
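To make the addressing idea concrete, here is a rough sketch of how a reference such as B4 or Sheet2!B4 could be turned into a sheet name plus zero-based row and column indices. The class name `CellAddress` and the `Sheet!Cell` syntax are assumptions of mine for illustration, not anything RMLProcessor or XLwrap defines, and quoted sheet names are not handled. Apache POI ships its own `CellReference` class for this job, so a real implementation would probably use that instead.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses references such as "B4" or "Sheet2!B4". */
final class CellAddress {
    // Optional sheet name before '!', then column letters, then a 1-based row number.
    private static final Pattern A1 =
            Pattern.compile("(?:([^!]+)!)?([A-Za-z]+)([1-9][0-9]*)");

    final String sheet; // null when the reference names no sheet
    final int row;      // zero-based row index
    final int col;      // zero-based column index

    private CellAddress(String sheet, int row, int col) {
        this.sheet = sheet;
        this.row = row;
        this.col = col;
    }

    static CellAddress parse(String ref) {
        Matcher m = A1.matcher(ref.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an A1-style reference: " + ref);
        }
        // Column letters are a base-26 number without a zero digit: A=1 ... Z=26, AA=27.
        int col = 0;
        for (char c : m.group(2).toUpperCase(Locale.ROOT).toCharArray()) {
            col = col * 26 + (c - 'A' + 1);
        }
        int row = Integer.parseInt(m.group(3));
        return new CellAddress(m.group(1), row - 1, col - 1);
    }

    public static void main(String[] args) {
        CellAddress a = CellAddress.parse("Sheet2!B4");
        System.out.println(a.sheet + " row=" + a.row + " col=" + a.col);
    }
}
```

With indices like these, a mapping could pick out a single cell even when no column holds a unique key.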
I think the library to use for the MS file formats would be Apache POI. I am not sure about OpenOffice, but this is probably mainly about published Excel files anyway.