linkedpipes / etl

LinkedPipes ETL is an RDF based, lightweight ETL tool
https://etl.linkedpipes.com

Files to RDF tries to determine file format and fails for HTML #821

Closed: jakubklimek closed this issue 3 years ago

jakubklimek commented 4 years ago

See this execution

The input is an HTML+RDFa file, and RDFa is selected as the input format. Nevertheless, the execution fails with:

2020-08-12 15:21:49,166 [asynchExecutor-1] ERROR c.l.e.e.e.ExecutionObserver - onExecuteComponentFailed : https://demo.etl.linkedpipes.com/resources/pipelines/1597238444058/component/760e-ad6e
com.linkedpipes.etl.executor.ExecutorException: PipelineComponent execution failed.
    at com.linkedpipes.etl.executor.component.SequentialComponentExecutor.run(SequentialComponentExecutor.java:42)
    at java.base/java.lang.Thread.run(Thread.java:830)
Caused by: com.linkedpipes.etl.executor.api.v1.LpException: Execution failed.
    at com.linkedpipes.etl.executor.api.v1.component.SequentialWrap.execute(SequentialWrap.java:51)
    at com.linkedpipes.etl.executor.component.SequentialComponentExecutor.run(SequentialComponentExecutor.java:38)
    ... 1 common frames omitted
Caused by: com.linkedpipes.etl.executor.api.v1.LpException: Can't determine format for file: rdfa.html
    at com.linkedpipes.etl.executor.api.v1.service.DefaultExceptionFactory.failure(DefaultExceptionFactory.java:12)
    at com.linkedpipes.plugin.transformer.filesToRdfGraph.FilesToRdfGraph.getFormat(FilesToRdfGraph.java:99)
    at com.linkedpipes.plugin.transformer.filesToRdfGraph.FilesToRdfGraph.loadEntry(FilesToRdfGraph.java:84)
    at com.linkedpipes.plugin.transformer.filesToRdfGraph.FilesToRdfGraph.loadFiles(FilesToRdfGraph.java:77)
    at com.linkedpipes.plugin.transformer.filesToRdfGraph.FilesToRdfGraph.execute(FilesToRdfGraph.java:52)
    at com.linkedpipes.etl.executor.api.v1.component.SequentialExecution.execute(SequentialExecution.java:11)
    at com.linkedpipes.etl.executor.api.v1.component.SequentialWrap.execute(SequentialWrap.java:49)
    ... 2 common frames omitted

skodapetr commented 3 years ago

It seems that although RDF4J knows RDFa as a format, it does not support reading it. See https://github.com/eclipse/rdf4j/issues/512
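
To illustrate the distinction, here is a minimal stand-alone sketch of how a format can be declared yet still not be found by a lookup that only consults registered parsers. It is a simplified model, not RDF4J's API or the actual FilesToRdfGraph code: the class, the maps, and the method `parserFormatForFileName` are all hypothetical names, and the assumption that the failing lookup is restricted to formats with a parser is an inference from the error message above.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical model only; none of these names come from RDF4J or LinkedPipes.
class FormatLookupSketch {

    // Formats that are declared, i.e. known by name, with their file extensions.
    static final Map<String, List<String>> KNOWN_FORMATS = Map.of(
            "Turtle", List.of("ttl"),
            "RDFa", List.of("xhtml", "html"));

    // Formats that actually have a parser registered.
    // RDFa is deliberately absent to mirror the situation in this issue.
    static final Set<String> FORMATS_WITH_PARSER = Set.of("Turtle");

    // Resolves a format from the file extension, but only among formats
    // that have a parser. A known format without a parser yields empty.
    static Optional<String> parserFormatForFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return Optional.empty();
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return KNOWN_FORMATS.entrySet().stream()
                .filter(e -> FORMATS_WITH_PARSER.contains(e.getKey()))
                .filter(e -> e.getValue().contains(extension))
                .map(Map.Entry::getKey)
                .findFirst();
    }

    public static void main(String[] args) {
        // The format is declared ...
        System.out.println(KNOWN_FORMATS.containsKey("RDFa"));
        // ... a format with a parser resolves ...
        System.out.println(parserFormatForFileName("data.ttl").isPresent());
        // ... but the declared-only format does not, which a caller would
        // report as something like "Can't determine format for file".
        System.out.println(parserFormatForFileName("rdfa.html").isPresent());
    }
}
```

Under this model, selecting RDFa in the component configuration would not help, because the selection still has to resolve to a format that has a parser behind it.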