kbss-cvut / s-pipes

Tool for execution of RDF-based pipelines.
GNU Lesser General Public License v3.0

Support simple implementation of excel files in tabular module #204

Closed blcham closed 11 months ago

blcham commented 1 year ago

This ticket is a simplification of https://github.com/kbss-cvut/s-pipes/issues/201: we want to support extraction of one sheet only and do not support merged cells (https://github.com/kbss-cvut/s-pipes/issues/215).

After this ticket is done, we will extend the implementation to achieve https://github.com/kbss-cvut/s-pipes/issues/201.

A/C:

rodionnv commented 1 year ago

@blcham "...if implemented using property holding mime-type value, it should be checked that combination of delimiter value and mime-type values are together consistent"

I don't quite understand: will there be both parameters, "delimiter" and "source-resource-format"? Or will there be only "source-resource-format", with the module later checking whether the actual parameter is consistent with the format?

blcham commented 1 year ago

If you choose the tab-separated-values mime type, the delimiter should default to "tab", and escaping should default to what the specification says as well. If you override the delimiter, I would throw an exception, as it is not tab-separated-values anymore.

blcham commented 1 year ago

We should always know which specification we refer to, and if a format is set, we should either comply with its specification or throw an exception. The exception should contain a link to the specification we used.

blcham commented 1 year ago

Also, it should be valid not to set the mime-type and to assume plain text with the delimiter and quoting set explicitly.
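
The rules discussed above could be sketched roughly as follows. This is only an illustration, not the actual s-pipes code: the class `SourceFormatConfig`, the enum `Format`, and the method `resolveDelimiter` are hypothetical names, and the choice of RFC 4180 as the reference for `text/csv` is an assumption. The TSV entry points at the IANA registry linked in this thread.

```java
/**
 * Hypothetical sketch of the mime-type / delimiter consistency check.
 * None of these names come from the s-pipes code base.
 */
final class SourceFormatConfig {

    /** Formats with the delimiter their specification mandates and a link to that specification. */
    enum Format {
        // Assumption: RFC 4180 is the specification we refer to for text/csv.
        CSV("text/csv", ',', "https://www.rfc-editor.org/rfc/rfc4180"),
        // Registry link taken from this thread.
        TSV("text/tab-separated-values", '\t',
            "https://www.iana.org/assignments/media-types/media-types.xhtml");

        final String mimeType;
        final char defaultDelimiter;
        final String specificationUrl;

        Format(String mimeType, char defaultDelimiter, String specificationUrl) {
            this.mimeType = mimeType;
            this.defaultDelimiter = defaultDelimiter;
            this.specificationUrl = specificationUrl;
        }

        static Format fromMimeType(String mimeType) {
            for (Format f : values()) {
                if (f.mimeType.equals(mimeType)) {
                    return f;
                }
            }
            throw new IllegalArgumentException("Unsupported mime-type: " + mimeType);
        }
    }

    private SourceFormatConfig() {
    }

    /**
     * Returns the delimiter to use.
     *
     * @param mimeType  value of "source-resource-format", or null if not set
     * @param delimiter explicitly configured delimiter, or null if not set
     */
    static char resolveDelimiter(String mimeType, Character delimiter) {
        if (mimeType == null) {
            // No mime-type: plain text, so the delimiter must be given explicitly.
            if (delimiter == null) {
                throw new IllegalArgumentException(
                    "Delimiter must be set explicitly when no mime-type is given.");
            }
            return delimiter;
        }
        Format format = Format.fromMimeType(mimeType);
        if (delimiter != null && delimiter != format.defaultDelimiter) {
            // Overriding the delimiter would violate the chosen specification.
            throw new IllegalArgumentException(
                "Delimiter '" + delimiter + "' is not consistent with mime-type "
                    + format.mimeType + ", see " + format.specificationUrl);
        }
        return format.defaultDelimiter;
    }
}
```

With this sketch, `resolveDelimiter("text/tab-separated-values", null)` yields a tab, `resolveDelimiter(null, ';')` yields the explicit delimiter, and `resolveDelimiter("text/tab-separated-values", ';')` throws an exception whose message links to the specification. Quoting and escaping defaults could be handled the same way with additional enum fields.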

blcham commented 1 year ago

links to standards: https://www.iana.org/assignments/media-types/media-types.xhtml