usc-isi-i2 / Web-Karma

Information Integration Tool
http://www.isi.edu/integration/karma/
Apache License 2.0

Null URI being generated in Karma Spark #319

Closed bsnikhila closed 7 years ago

bsnikhila commented 7 years ago

When I use the property crm:P62_depicts in my model, Karma Spark generates a null URI in addition to the required URI for the corresponding entity. The RDF generated in the UI contains the right triples and no null URIs, so the null URIs seem to come from Karma Spark.

Example model: https://github.com/american-art/cbm/tree/master/CBMAA_Roles (CBMAA_Roles-model.ttl)

dkapoor commented 7 years ago

Nikhila, can you check whether it's an issue with the workflow itself? I ran it through Spark using the attached workflow, and it returns the correct triples, including P62_depicts. I am also attaching the output I got from running that workflow.

karmaWorkflowCSV.py.zip

part-00000.zip

dkapoor commented 7 years ago

@mit2nil: Is this working? Can we close this issue?

mit2nil commented 7 years ago

@bsnikhila can I run CBM to try out the fix?

mit2nil commented 7 years ago

@dkapoor I just re-ran the workflow for cbm, and there are still plenty of cases where I see [ 1 anonymous resource]. Below are a couple of examples:

http://data.americanartcollaborative.org/page/cbm/object/184
http://data.americanartcollaborative.org/page/cbm/object/100

I am not sure what can go wrong in the workflow, as we use a similar config for all museums. I am copying the config used by our script below for reference.

    {
        'model_uri': 'file:///opt/aac-repos/cbm/CBMAA_Roles/CBMAA_Roles-model.ttl',
        'input_file': '/opt/aac-repos/cbm/CBMAA_Roles/LOD CBMAA Constituents.csv',
        'input_file_type': 'csv',
        'additional_settings': {
            'rdf.generation.selection': 'DEFAULT_TEST',
            'karma.output.format': 'n3',
            'karma.input.delimiter': ','
        },
        'path': './../../aac-repos/cbm',
        'output_file_name': 'CBMAA_Roles',
        'context_uri': 'https://github.com/american-art/aac-alignment/blob/master/karma-context.json',
        'rdf_root_uri': 'http://www.cidoc-crm.org/cidoc-crm/E22_Man-Made_Object1',
        'name': 'CBMAA_Roles',
        'base_uri': 'http://data.crystalbridges.org/',
        'model_file': '/opt/aac-repos/cbm/CBMAA_Roles/CBMAA_Roles-model.ttl',
        'output_file': '/opt/aac-repos/cbm/CBMAA_Roles/CBMAA_Roles.n3',
        'num_partitions': 1,
        'output_dir': '/opt/aac-repos/cbm/CBMAA_Roles/output',
        'output_file_type': 'n3'
    }

Maybe @bsnikhila can add comments if any issues were found on the modeling side or in the data itself.
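
The "anonymous resource" entries mentioned above correspond to blank nodes in the generated RDF. As a rough way to locate them, here is a minimal sketch that scans the output for triples whose object is a blank node and groups them by predicate. It assumes the generated file holds one triple per line in N-Triples-style syntax, which may not match what Karma actually writes; the function names and the `example.org` sample URIs are made up for illustration and are not part of Karma or the workflow.

```python
import re
from collections import Counter

# Matches a blank-node label such as _:b0 (N-Triples-style syntax).
BLANK_NODE = re.compile(r"_:[A-Za-z0-9_][A-Za-z0-9_.-]*")


def strip_literals(line):
    """Replace quoted literals with "" so a '_:' inside a string is ignored."""
    return re.sub(r'"(?:[^"\\]|\\.)*"', '""', line)


def blank_node_report(lines):
    """Count, per predicate, the triples whose object is a blank node.

    Assumes one 'subject predicate object .' triple per line.
    """
    counts = Counter()
    for raw in lines:
        line = strip_literals(raw.strip())
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        _subject, predicate, rest = parts
        obj = rest.rstrip(" .").strip()  # drop the terminating ' .'
        if BLANK_NODE.fullmatch(obj):
            counts[predicate] += 1
    return counts


# Hypothetical sample data; real output would be read from the generated file.
SAMPLE = [
    "<http://example.org/object/184> <http://www.cidoc-crm.org/cidoc-crm/P62_depicts> _:b0 .",
    "<http://example.org/object/184> <http://www.cidoc-crm.org/cidoc-crm/P62_depicts> <http://example.org/subject/1> .",
    '<http://example.org/object/184> <http://www.w3.org/2000/01/rdf-schema#label> "text with _:x inside" .',
]

if __name__ == "__main__":
    for predicate, n in blank_node_report(SAMPLE).items():
        print(predicate, n)
```

To run it against a real file, pass an open file handle instead of `SAMPLE`, for example `blank_node_report(open(path, encoding="utf-8"))`. A predicate with a non-zero count points at the link in the model that is producing nodes without a URI.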

bsnikhila commented 7 years ago

Sorry for the delay in response. The model seems to be the same as some of the older models that are working fine. I don't think it's an issue with the model.

bsnikhila commented 7 years ago

There was no bug in Karma or in the workflow. At some point, Karma had suggested a few classes, and links to those classes, by itself in a different model, and this went unnoticed. I removed those extra links and classes, and it works fine now. Sorry for the inconvenience.
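
Extra classes of the kind described above can be hard to spot by eye. One possible check, sketched below, is to collect the `rdf:type` classes that appear in the generated output and compare them with the classes the modeler expects. This assumes one N-Triples-style triple per line and assumes the unwanted classes show up as `rdf:type` statements in the output, which may not hold for every model; the helper names and `example.org` URIs are hypothetical.

```python
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def classes_used(lines):
    """Return the set of class terms that appear as objects of rdf:type."""
    classes = set()
    for raw in lines:
        parts = raw.strip().split(None, 2)
        if len(parts) == 3 and parts[1] == RDF_TYPE:
            classes.add(parts[2].rstrip(" .").strip())
    return classes


def unexpected_classes(expected, lines):
    """Return classes found in the output that are not in the expected set."""
    return classes_used(lines) - set(expected)


# Hypothetical expectation and sample output, for illustration only.
EXPECTED = {"<http://www.cidoc-crm.org/cidoc-crm/E22_Man-Made_Object>"}
SAMPLE_OUTPUT = [
    "<http://example.org/object/184> " + RDF_TYPE
    + " <http://www.cidoc-crm.org/cidoc-crm/E22_Man-Made_Object> .",
    "_:b0 " + RDF_TYPE + " <http://www.cidoc-crm.org/cidoc-crm/E55_Type> .",
]

if __name__ == "__main__":
    for cls in sorted(unexpected_classes(EXPECTED, SAMPLE_OUTPUT)):
        print("unexpected class:", cls)
```

Any class reported here that the modeler did not add deliberately is a candidate for a suggestion that was accepted without being noticed.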