BlueBrain / nexus-forge

Building and Using Knowledge Graphs made easy
https://nexus-forge.readthedocs.io
GNU Lesser General Public License v3.0

Mismatch between downloaded name and KG value #279

Open mgeplf opened 1 year ago

mgeplf commented 1 year ago

When using kgforge to download this file: https://bbp.epfl.ch/nexus/web/bbp/neocortex/resources/1e2d3085-b591-4041-a33f-d8220cd72a61

It uses the filename "file", instead of the one saved in the knowledge graph:

image

This appears to be because of the code here: https://github.com/BlueBrain/nexus-forge/blob/master/kgforge/specializations/stores/bluebrain_nexus.py#L381

Which gets the following metadata value:

image

crisely09 commented 1 year ago

Hello. To me there is no error, at least not in the download. The problem may have occurred when the file was uploaded: there seems to be a mismatch between the distribution and the actual file. However, you do get the correct file, and it has the "correct" name. I would not consider this a forge problem, but a data registration problem.

mgeplf commented 1 year ago

When the file is downloaded through the Nexus web UI, the browser tries to save it as 535524026.nwb and not file, so the web UI seems to get it right.

How can we prevent these registration mistakes from being made? Can we do a search for all the ones that are incorrect?
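One way such a search could look, as a rough sketch rather than anything forge provides: compare the name recorded in each resource's distribution with the filename stored in the file's own metadata. The function name `find_filename_mismatches`, the input shape (pairs of plain dicts with a single distribution each), and the keys `id`, `distribution`, `name` and `_filename` are assumptions for illustration. Fetching the resources and the file metadata is left out, and the sample ids are made up.

```python
from typing import Dict, Iterable, List, Tuple


def find_filename_mismatches(
    pairs: Iterable[Tuple[Dict, Dict]],
) -> List[Tuple[str, str, str]]:
    """Return (resource id, distribution name, stored filename) per mismatch.

    Each pair is (resource, file_metadata), both plain dicts. The keys used
    here are assumptions about the payload shape, not a documented forge
    interface. A resource is assumed to carry a single distribution dict.
    """
    mismatches = []
    for resource, file_metadata in pairs:
        expected = resource.get("distribution", {}).get("name")
        stored = file_metadata.get("_filename")
        # Only report when both names are known and they differ.
        if expected is not None and stored is not None and expected != stored:
            mismatches.append((resource.get("id", ""), expected, stored))
    return mismatches


# Hypothetical sample data: one wrongly registered file, one correct one.
pairs = [
    (
        {"id": "resource-1", "distribution": {"name": "535524026.nwb"}},
        {"_filename": "file"},
    ),
    (
        {"id": "resource-2", "distribution": {"name": "something.nwb"}},
        {"_filename": "something.nwb"},
    ),
]

print(find_filename_mismatches(pairs))
# prints [('resource-1', '535524026.nwb', 'file')]
```

Entries with a missing name on either side are skipped rather than reported, so the list only contains cases where the two names are known to disagree.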

crisely09 commented 1 year ago

Yes, I also just uploaded some files using forge, and they were saved with their own file names (something.nwb) and not just file. I believe this happened during a data migration process. As for updating the wrongly named files, that is something to discuss on the Jira ticket. Maybe I should close this issue here, as it's not a forge/Nexus problem.

mgeplf commented 1 year ago

If you download the resource in question through the website (https://bbp.epfl.ch/nexus/web/bbp/neocortex/resources/1e2d3085-b591-4041-a33f-d8220cd72a61), which filename does it save it under?

crisely09 commented 1 year ago

Right. It's the same if I ask you to download this resource: https://bbp.epfl.ch/nexus/web/bbp/neocortex/resources/e3fbe8c5-dd8c-41b1-9953-e22738f1405b?rev=1#JSON. What filename does it have?

Technically it is not an issue with forge. I understand your point: you would like forge to catch the cases where the file name is file instead of the name given in the main resource's distribution. The problem is that it means we would be catching an error that should not exist, in principle. What needs to be corrected is the files' metadata, and on this I agree.
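To make the behaviour under discussion concrete, here is a minimal sketch of such a fallback. It is not what forge does: `choose_download_name` and `PLACEHOLDER_NAME` are hypothetical names, and treating the literal string file as a placeholder is an assumption based only on the case reported in this issue.

```python
from typing import Optional

# Assumption: "file" is the generic name seen on the wrongly registered files.
PLACEHOLDER_NAME = "file"


def choose_download_name(
    stored_filename: Optional[str],
    distribution_name: Optional[str],
) -> str:
    """Pick a local filename for a download.

    Prefer the filename stored in the file's metadata. Fall back to the
    distribution name only when the stored one is missing or is the
    placeholder. If neither is usable, keep the placeholder.
    """
    if stored_filename and stored_filename != PLACEHOLDER_NAME:
        return stored_filename
    if distribution_name:
        return distribution_name
    return stored_filename or PLACEHOLDER_NAME


print(choose_download_name("file", "535524026.nwb"))
# prints 535524026.nwb
print(choose_download_name("something.nwb", "other.nwb"))
# prints something.nwb
```

A fallback like this would hide the registration mistake rather than fix it, which is the concern raised above; correcting the metadata removes the need for it.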

crisely09 commented 1 year ago

Maybe it's a good time to ask @MFSY what would be better?

I have stated my point of view, but I am fine with any decision, as long as the issue is solved.