ciyer closed 5 years ago
I think I know what has happened here. As we know, 'renku log' fails quite often during triples generation, and because of that the data in our KG is not complete. What this particular exception says is that the query found an edge but we couldn't find a node matching it. So quite clearly we generated triples for some commit, but triples generation was not successful for either the preceding or the following commit. What can we do? We've got this bug https://github.com/SwissDataScienceCenter/renku-python/issues/616 to fix, and maybe some others can be raised, as there are other causes of triples generation failures.
This is also visible on dev at https://dev.renku.ch/projects/cramakri/renku-tutorial-flights/files/lineage/notebooks/00-FilterFlights.ran.ipynb
At least it's consistent. Let me try to find the relevant exception so we know what to fix.
I've just done some investigation and it looks like it has to be something else, as there are no exceptions during triples generation. I'll look into that more.
There was some mysterious logic conditioning the raw data returned from the SPARQL lineage query. This logic was removing nodes matching some specific criteria and thus corrupting the result. That seemed to be wrong, so it was deleted. The request mentioned in the previous comments now works correctly.
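To illustrate the failure mode, here is a minimal sketch in Python (not the actual knowledge graph service code) of how dropping nodes from a lineage result while keeping the edges leaves edges with no matching node. The `Edge` type, the `find_dangling_edges` helper, the sample node ids, and the filter criterion are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, NamedTuple


class Edge(NamedTuple):
    """A directed lineage edge between two node ids (hypothetical model)."""
    source: str
    target: str


def find_dangling_edges(nodes: Iterable[str], edges: Iterable[Edge]) -> List[Edge]:
    """Return the edges with at least one endpoint missing from `nodes`."""
    known = set(nodes)
    return [e for e in edges if e.source not in known or e.target not in known]


# Hypothetical raw query result: node ids and the edges between them.
raw_nodes = ["data/input.csv", "run-1", "notebooks/output.ipynb"]
raw_edges = [
    Edge("data/input.csv", "run-1"),
    Edge("run-1", "notebooks/output.ipynb"),
]

# Assumed stand-in for the removed logic: it drops some nodes by a criterion
# but leaves the edges untouched.
filtered_nodes = [n for n in raw_nodes if not n.startswith("run-")]

dangling_raw = find_dangling_edges(raw_nodes, raw_edges)
dangling_filtered = find_dangling_edges(filtered_nodes, raw_edges)
```

With the raw result every edge resolves to a node, whereas after the filtering step both edges point at the removed `run-1` node, which is the "edge without a matching node" situation the exception reported.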
There was some mysterious logic
🤔
Yeah, there was some logic which I ported from the original implementation done by Jiri. Although I knew how it worked, I never understood why we needed it. So I thought, right, let's keep it, as there may be cases I don't know about. And yes, it got triggered last week and caused some errors :) So I removed the logic, tested different scenarios and found that everything works fine.
Requesting the lineage for one of the notebooks in the Renkulab project cramakri/renku-tutorial-flights results in a stack trace on the knowledge graph service:
Viewing
yields
On the server.