BlueBrain / nexus-forge

Building and Using Knowledge Graphs made easy
https://nexus-forge.readthedocs.io
GNU Lesser General Public License v3.0

Error when downloading data #253

Closed lidakanari closed 2 years ago

lidakanari commented 2 years ago

Hi, I'm trying to use forge to download a dataset at a specific tag. Here is the code I used:

import getpass

from kgforge.core import KnowledgeGraphForge

TOKEN = getpass.getpass()  # prompt for the Nexus access token
bucket = "bbp-external/seu"
# The configuration path is relative to the current working directory
forge_tag = KnowledgeGraphForge("./nexus-forge/examples/notebooks/use-cases/prod-forge-nexus.yml", bucket=bucket, token=TOKEN)
tag = "V5_20210701"
_type = "NeuronMorphology"
path = forge_tag.paths("Dataset")  # to have autocompletion on the properties
data = forge_tag.search(path.type.id == _type, limit=500)
print(f"{len(data)} data of type '{_type}' found.")

which finds 400 morphologies. However, when I try:

results = [forge_tag.retrieve(d.id, version=tag) for d in data]

I get an error:

<action> retrieve
<error> RetrievalError: tag requested 'V5_20210701' not found

and when I try to download, it fails:

dirpath = "../../../Morphologies/RAW/SEU/All/"
forge_tag.download(results, "distribution.contentUrl", dirpath)

<action> collect_values
<error> DownloadingError: An error occur when collecting values for path to follow 'distribution.contentUrl': not a Resource nor a list of Resource

annakristinkaufmann commented 2 years ago

Hi @lidakanari!

The first error (RetrievalError) occurs because, although there are 400 morphologies in total, not all of them are tagged with V5_20210701. We did not yet have all 400 in July last year, when the data were tagged with V5_20210701. The most recent tag is 20220411, and it should be available on all 400 neuron morphologies.

I tried the download on the data retrieved with the tag 20220411, and it seemed to work for all 400 neuron morphologies.
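To see which morphologies actually carry a given tag, the retrieval can be split into hits and misses. This is a minimal sketch, assuming that `forge.retrieve` returns `None` (after printing the `<error>` message) when the requested tag does not exist; the helper name `retrieve_tagged` is hypothetical and not part of Nexus Forge. If that assumption holds, it would also account for the DownloadingError reported above, since a list containing `None` entries is not a list of Resources.

```python
def retrieve_tagged(forge, resources, tag):
    """Retrieve each resource at `tag`; return (found, missing_ids).

    Assumption: forge.retrieve(...) returns None when the tag is not found.
    """
    found, missing_ids = [], []
    for resource in resources:
        tagged = forge.retrieve(resource.id, version=tag)
        if tagged is None:
            # No revision of this resource carries the requested tag
            missing_ids.append(resource.id)
        else:
            found.append(tagged)
    return found, missing_ids
```

With the `forge`, `data`, and `tag` used in this thread, `found, missing_ids = retrieve_tagged(forge, data, tag)` would give a list that can be passed to `forge.download` and a list of ids that lack the tag.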

This is the code snippet I ran using Nexus Forge version 0.7.1:

import getpass

from kgforge.core import KnowledgeGraphForge

TOKEN = getpass.getpass()  # prompt for the Nexus access token

forge = KnowledgeGraphForge("https://raw.githubusercontent.com/BlueBrain/nexus-forge/master/examples/notebooks/use-cases/prod-forge-nexus.yml",
                            token=TOKEN,
                            bucket="bbp-external/seu")

tag = "20220411"
_type = "NeuronMorphology"
data = forge.search({"type": _type}, limit=500)
print(f"{len(data)} data of type '{_type}' found.")

# Retrieve each resource at the revision carrying the tag
results = [forge.retrieve(d.id, version=tag) for d in data]
print(f"{len(results)} data of type '{_type}' at tag {tag} found.")

# Download the distribution of each resource into dirpath
dirpath = "./"
for r in results:
    forge.download(r, "distribution.contentUrl", dirpath)

lidakanari commented 2 years ago

Hi @annakristinkaufmann, thanks a lot for the fast response. I have tried the code above and it works without errors.

However, the data are not saved in the dirpath that I specified. Do you know whether the download works correctly, or whether the data are saved somewhere else by default?

Thanks a lot for your input!
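
Relative paths such as `./` are resolved against the current working directory of the Python process, so one way to check where files ended up is to print the absolute target directory and list what it contains. This is a minimal sketch using only the standard library; the helper name `list_downloaded` is hypothetical, and it makes no assumption about how Nexus Forge names the downloaded files.

```python
from pathlib import Path


def list_downloaded(dirpath):
    """Return (absolute_dir, files) for all regular files under dirpath."""
    # resolve() turns a relative path into an absolute one,
    # based on the current working directory
    target = Path(dirpath).resolve()
    files = sorted(p for p in target.rglob("*") if p.is_file())
    return target, files
```

Calling `list_downloaded(dirpath)` with the same `dirpath` that was passed to `forge.download` shows the absolute directory that the relative path points to, which may differ from the expected one if the notebook was started from another folder.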

MFSY commented 2 years ago

@lidakanari, @annakristinkaufmann, can this issue be closed?

annakristinkaufmann commented 2 years ago

Hi @MFSY! I think this can be closed.