openbudgets / pipeline-fragments

Reusable fragments of LinkedPipes ETL pipelines

FDP2RDF: problem in testing with real datasets from OS #12

Closed HimmelStein closed 7 years ago

HimmelStein commented 8 years ago

@marek-dudas please follow these testing steps and see what happens on your side:

1: go to http://staging.openspending.org/packager/provide-data
2: log in
3: upload a CSV (downloaded from https://openspending.org/kutno_w2014/meta)
4: follow the steps of the wizard
5: push the FDP to the OS server
6: click 'Profile -> My Datasets' in the login box at the top right of the page
7: click the 'Run external hooks' button to send the datapackage.jsonld to the FDP2RDF pipeline

On my side, with the above CSV (https://openspending.org/kutno_w2014/meta), the FDP2RDF pipeline received the processing request and failed with the following error message in the log file:

2016-08-05 12:11:17,555 [HTTP get list] ERROR c.l.e.e.e.EventFactory - componentFailed com.linkedpipes.etl.executor.api.v1.RdfException: Invalid property: http://plugins.linkedpipes.com/ontology/e-httpGetFiles#fileUri
    at com.linkedpipes.etl.executor.api.v1.RdfException.invalidProperty(RdfException.java:76) ~[api-executor-v1-0.0.0.jar:na]
    at com.linkedpipes.etl.component.api.impl.ExceptionFactoryImpl.invalidConfigurationProperty(ExceptionFactoryImpl.java:31) ~[na:na]
    at com.linkedpipes.plugin.extractor.httpgetfiles.HttpGetFiles.download(HttpGetFiles.java:106) ~[na:na]
    at com.linkedpipes.plugin.extractor.httpgetfiles.HttpGetFiles.execute(HttpGetFiles.java:67) ~[na:na]
    at com.linkedpipes.etl.component.api.impl.SimpleComponentImpl.execute(SimpleComponentImpl.java:290) ~[na:na]
    at com.linkedpipes.etl.executor.component.ExecuteComponent.run(ExecuteComponent.java:129) ~[executor.jar:na]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92-internal]
2016-08-05 12:11:17,555 [HTTP get list] ERROR c.l.e.e.e.EventFactory - executionFailed: Component execution failed.
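
For triaging logs like the one above, here is a minimal sketch that pulls out the failing component name and the offending property IRI. It only assumes the line layout visible in this log excerpt. The function name `find_invalid_properties` is hypothetical and not part of LinkedPipes ETL.

```python
import re

# Matches the "Invalid property: <IRI>" message seen in the log excerpt.
INVALID_PROPERTY = re.compile(r"RdfException: Invalid property: (\S+)")
# Matches the bracketed component name that precedes the ERROR level,
# e.g. "[HTTP get list] ERROR".
COMPONENT = re.compile(r"\[([^\]]+)\]\s+ERROR")


def find_invalid_properties(log_text):
    """Return (component, property IRI) pairs for each 'Invalid property' error."""
    results = []
    for line in log_text.splitlines():
        prop = INVALID_PROPERTY.search(line)
        if prop is None:
            continue
        comp = COMPONENT.search(line)
        results.append((comp.group(1) if comp else None, prop.group(1)))
    return results


# Sample line copied from the log in this issue.
sample = (
    "2016-08-05 12:11:17,555 [HTTP get list] ERROR c.l.e.e.e.EventFactory - componentFailed "
    "com.linkedpipes.etl.executor.api.v1.RdfException: Invalid property: "
    "http://plugins.linkedpipes.com/ontology/e-httpGetFiles#fileUri"
)
print(find_invalid_properties(sample))
```

Here the result points at the `fileUri` property of the "HTTP get list" component, which suggests that the component's configuration did not receive a usable file URI from the incoming descriptor.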

marek-dudas commented 8 years ago

Could you please provide the datapackage.jsonld that was sent to the pipeline? If I run the packager myself, I might get quite a different descriptor. Also, I don't have access to the Fraunhofer server from my home computer, but I might be able to spot the error directly in the datapackage.jsonld.

HimmelStein commented 8 years ago

@marek-dudas we cannot get the datapackage.jsonld from the OpenSpending platform; we just click a button to trigger it. One solution would be for the pipeline to print the received datapackage.jsonld on the screen or write it to the log.
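
The logging idea can be sketched as follows. This is only an illustration under stated assumptions: the function `log_received_descriptor` and the logger name `fdp2rdf.receiver` are hypothetical and not part of LinkedPipes ETL or OpenSpending, and the sketch assumes the descriptor arrives as a UTF-8 encoded JSON body.

```python
import json
import logging

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("fdp2rdf.receiver")


def log_received_descriptor(raw_body: bytes) -> dict:
    """Parse the received descriptor and write it to the log before processing."""
    text = raw_body.decode("utf-8")
    try:
        descriptor = json.loads(text)
    except json.JSONDecodeError:
        # Log the raw payload so that a malformed descriptor can still be inspected.
        logger.error("Received descriptor is not valid JSON: %s", text)
        raise
    # Pretty-print with sorted keys so that two received descriptors are easy to diff.
    logger.info(
        "Received datapackage.jsonld:\n%s",
        json.dumps(descriptor, indent=2, sort_keys=True),
    )
    return descriptor
```

Logging the payload at the entry point would make it possible to reproduce a failure without access to the triggering platform.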

You should be able to connect to the Fraunhofer server from the UEP computers, as access is restricted to the UEP IP addresses. Please contact @mlukasch if you have problems connecting to the Fraunhofer server from UEP.

marek-dudas commented 8 years ago

The datapackage.jsonld can be obtained through the Executions view in the LP-ETL web UI. I could do that from the UEP network, but I am currently at home, so it's difficult for me. I will try to do it this afternoon, but then I will be offline until the end of the week. During the rest of August I will be reimplementing part of the pipeline to speed it up, so there will be some changes and more debugging needed anyway.

marek-dudas commented 8 years ago

I think I fixed that directly in the pipeline on the Fraunhofer server. I will also include the fix in the next GitHub commit. Looking at the pipeline output, there seems to be some other issue. However, I will not fix it now, as it is probably in the part that will be reimplemented (starting next week).

HimmelStein commented 8 years ago

It is good to test the pipeline directly on the Fraunhofer server. Just try to use VPN to connect to your UEP server. Feel free to contact Maik.

marek-dudas commented 7 years ago

Closing this issue as it was fixed long ago and nobody complained.