mobie / mobie-utils-python

Python tools for MoBIE
MIT License

add remote metadata fails (spec-v2) #30

Closed martinschorb closed 3 years ago

martinschorb commented 3 years ago

Hi,

I am trying to add remote metadata and get:

bucket_name = 'centriole-tomo-datasets'
service_endpoint = 'https://s3.embl.de'

add_remote_project_metadata(
    datadir,
    bucket_name,
    service_endpoint
)
Traceback (most recent call last):

  File "<ipython-input-6-6d3f1fe22128>", line 5, in <module>
    add_remote_project_metadata(

  File ".../mobie-utils-python/mobie/metadata/remote_metadata.py", line 24, in add_remote_project_metadata
    datasets = get_datasets(root)

  File ".../mobie-utils-python/mobie/metadata/project_metadata.py", line 68, in get_datasets
    return read_project_metadata(root)['datasets']

KeyError: 'datasets'

martinschorb commented 3 years ago

BTW there is a typo in https://github.com/mobie/mobie-utils-python/blob/spec-v-02/examples/create_mobie_project.ipynb

In the last code block you import the function from metadata but call it as metadata.add_remote....

constantinpape commented 3 years ago

Hi,

I am trying to add remote metadata and get:

bucket_name = 'centriole-tomo-datasets'
service_endpoint = 'https://s3.embl.de'

add_remote_project_metadata(
    datadir,
    bucket_name,
    service_endpoint
)

datadir needs to be the root directory here, i.e. where the project.json is located.

martinschorb commented 3 years ago

That gives me the same error.

constantinpape commented 3 years ago

It works in the tests: https://github.com/mobie/mobie-utils-python/blob/spec-v-02/test/metadata/test_remote_metadata.py#L49

Can you please check the script into git and send me the link?

Also, your project.json is currently invalid:

{
  "datasets": ["tomo"],
  "imageDataFormats": [
    "bdv.n5"
  ],
  "defaultDataset": "tomo",
  "myeloma centriole dataset.": "myeloma centriole dataset.",
  "specVersion": "0.2.0"
}

The field myeloma ... is not allowed. You can have the optional field description if you want to add some description.
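
A minimal Python sketch of such a check follows. The set of known fields is an assumption limited to the names that appear in this thread (the actual spec may define more), and find_unknown_fields is a hypothetical helper, not part of mobie-utils-python.

```python
import json

# Assumption: only the field names mentioned in this thread; the real
# MoBIE spec may define additional fields.
KNOWN_FIELDS = {
    "datasets",
    "defaultDataset",
    "description",
    "imageDataFormats",
    "specVersion",
}


def find_unknown_fields(project_json_text):
    """Return the top-level keys that are not in KNOWN_FIELDS, sorted."""
    metadata = json.loads(project_json_text)
    return sorted(set(metadata) - KNOWN_FIELDS)


example = """
{
  "datasets": ["tomo"],
  "imageDataFormats": ["bdv.n5"],
  "defaultDataset": "tomo",
  "myeloma centriole dataset.": "myeloma centriole dataset.",
  "specVersion": "0.2.0"
}
"""
unknown = find_unknown_fields(example)
print(unknown)
```

Moving the free text into a "description" field would make the same check return an empty list.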

martinschorb commented 3 years ago

OK, I fixed the project.json.

There is no script. It is literally just the function call that I am doing with datadir pointing to the root folder.

martinschorb commented 3 years ago

I cannot extract the datasets:

mm.project_metadata.get_datasets('/g/schwab/Tobias/MoBIE')
Traceback (most recent call last):

  File "<ipython-input-28-2ee1cfca4bac>", line 1, in <module>
    mm.project_metadata.get_datasets('/g/schwab/Tobias/MoBIE')

  File ".../mobie-utils-python/mobie/metadata/project_metadata.py", line 68, in get_datasets
    return read_project_metadata(root)['datasets']

KeyError: 'datasets'

constantinpape commented 3 years ago

Ok, I can try to debug it; can you please commit all local changes in /g/schwab/Tobias/MoBIE before that? (I don't think I have permissions)

constantinpape commented 3 years ago

mm.project_metadata.get_datasets('/g/schwab/Tobias/MoBIE')

The path is wrong: it needs to be /g/schwab/Tobias/MoBIE/data, which is where the project.json is.

martinschorb commented 3 years ago

The path is wrong: it needs to be /g/schwab/Tobias/MoBIE/data, which is where the project.json is.

That's it!

constantinpape commented 3 years ago

Yes, the error is a bit confusing, because it just loads an empty dict if project.json does not exist. There are internal reasons for this, but I will update the add_remote_project_metadata function so that it checks that the project.json file exists.
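
To illustrate why the KeyError appears and what such a check could look like, here is a rough sketch. It is not the actual mobie-utils-python code: read_project_metadata_sketch and get_datasets_checked are hypothetical stand-ins that only mimic the behaviour described above.

```python
import json
import os
import tempfile


def read_project_metadata_sketch(root):
    """Mimic the described behaviour: return {} if project.json is missing."""
    path = os.path.join(root, "project.json")
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def get_datasets_checked(root):
    """Fail early with a clear message instead of a KeyError."""
    path = os.path.join(root, "project.json")
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"No project.json found in {root}; pass the folder that contains it."
        )
    return read_project_metadata_sketch(root)["datasets"]


with tempfile.TemporaryDirectory() as project:
    data_dir = os.path.join(project, "data")
    os.makedirs(data_dir)
    with open(os.path.join(data_dir, "project.json"), "w") as f:
        json.dump({"datasets": ["tomo"], "specVersion": "0.2.0"}, f)

    # Parent folder: the empty dict hides the real problem, so a later
    # lookup of "datasets" would raise KeyError.
    print(read_project_metadata_sketch(project))
    # Correct root (the folder holding project.json): datasets are found.
    print(get_datasets_checked(data_dir))
```

With the explicit existence check, passing the parent folder raises a FileNotFoundError that names the folder, rather than the less informative KeyError: 'datasets'.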

martinschorb commented 3 years ago

Can I access a private git repository from MoBIE to test the S3 data?

Or should I open the local project and use the S3 storage through Open Project Expert Mode?

constantinpape commented 3 years ago

Can I access a private git repository from MoBIE to test the S3 data?

I think that is not supported yet because we do not support github tokens there yet. However, @K-Meech has added github tokens for some other functionality, so it might be relatively easy to reuse it and also support private repos.

Or should I open the local project and use the S3 storage through Open Project Expert Mode?

Yes, that's the way to go now; but I am not sure if it will work out of the box with the new MoBIE.

K-Meech commented 3 years ago

I think that is not supported yet because we do not support github tokens there yet. However, @K-Meech has added github tokens for some other functionality, so it might be relatively easy to reuse it and also support private repos.

Good point. I'll make an issue on the mobie-viewer-fiji repo for supporting private github repos.

martinschorb commented 3 years ago

I get

.../mobie-utils-python/mobie/metadata/remote_metadata.py:58: UserWarning: Could not find data path at .../MoBIE/data/tomo/images/bdv-n5/MMRR_02_Grid4_c157.n5 corresponding to xml .../MoBIE/data/tomo/images/bdv-n5/MMRR_02_Grid4_c157.xml

In the XML, I specify the n5 path inside sub-directories. I'd like to keep this structure, as there are already several thousand XMLs in that directory, and I would prefer to leave the n5s where they are.

Could this support for sub-directories be implemented?
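
For reference, the kind of lookup this would require can be sketched as follows. This is only an illustration, not the mobie-utils-python implementation: it assumes the usual BigDataViewer layout with the n5 location stored under SequenceDescription/ImageLoader/n5 and a BasePath of ".", and resolve_n5_path is a hypothetical helper.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def resolve_n5_path(xml_path):
    """Return the normalised path of the n5 container referenced by a BDV XML.

    Assumes the n5 location is stored in SequenceDescription/ImageLoader/n5.
    """
    root = ET.parse(xml_path).getroot()
    node = root.find("SequenceDescription/ImageLoader/n5")
    if node is None or not node.text:
        raise ValueError(f"No n5 entry found in {xml_path}")
    n5_path = node.text.strip()
    if node.get("type", "relative") == "relative":
        # Interpret relative paths against the folder of the XML itself,
        # so entries such as "sub/dir/volume.n5" keep their sub-directories.
        xml_dir = os.path.dirname(os.path.abspath(xml_path))
        n5_path = os.path.join(xml_dir, n5_path)
    return os.path.normpath(n5_path)


# Hypothetical XML with the n5 container in a sub-directory.
xml_text = """<SpimData version="0.2">
  <BasePath type="relative">.</BasePath>
  <SequenceDescription>
    <ImageLoader format="bdv.n5">
      <n5 type="relative">sub/dir/volume.n5</n5>
    </ImageLoader>
  </SequenceDescription>
</SpimData>"""

with tempfile.TemporaryDirectory() as folder:
    xml_file = os.path.join(folder, "volume.xml")
    with open(xml_file, "w") as f:
        f.write(xml_text)
    print(resolve_n5_path(xml_file))
```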

constantinpape commented 3 years ago

I have just pushed a commit that should fix this, but haven't fully tested it. Could you update your mobie-utils and give it a try?

martinschorb commented 3 years ago

Still the same.

I manually updated the remote XMLs on the main branch and that worked.

martinschorb commented 3 years ago

One small question:

do the XMLs need to go onto the S3 as well or are they taken from GitHub?

constantinpape commented 3 years ago

One small question:

do the XMLs need to go onto the S3 as well or are they taken from GitHub?

We actually have two options for accessing a project on s3 now:

  1. read all data from s3
  2. read metadata from github and image data from s3

For case 1 the xmls need to go onto s3 as well; for case 2 they don't.

martinschorb commented 3 years ago

OK, thanks! Then I need to do some cleaning there as well to be sure everything is in sync.

martinschorb commented 3 years ago

OK, next try. Looks like you forgot some hard-coded path in there...

java.io.FileNotFoundException: /Volumes/emcf/pape/https:/raw.githubusercontent.com/mobie/centrioles-tomo-datasets/master/data/project.json (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at de.embl.cba.tables.FileAndUrlUtils.getInputStream(FileAndUrlUtils.java:175)
    at de.embl.cba.tables.FileAndUrlUtils.read(FileAndUrlUtils.java:187)
    at de.embl.cba.mobie.serialize.ProjectJsonParser.parseProject(ProjectJsonParser.java:19)
    at de.embl.cba.mobie.MoBIE.<init>(MoBIE.java:93)
    at de.embl.cba.mobie.command.OpenMoBIEProjectCommand.run(OpenMoBIEProjectCommand.java:26)
    at org.scijava.command.CommandModule.run(CommandModule.java:196)
    at org.scijava.module.ModuleRunner.run(ModuleRunner.java:165)
    at org.scijava.module.ModuleRunner.call(ModuleRunner.java:124)
    at org.scijava.module.ModuleRunner.call(ModuleRunner.java:63)
    at org.scijava.thread.DefaultThreadService.lambda$wrap$2(DefaultThreadService.java:225)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

constantinpape commented 3 years ago

OK, next try. Looks like you forgot some hard-coded path in there...

java.io.FileNotFoundException: /Volumes/emcf/pape/https:/raw.githubusercontent.com/mobie/centrioles-tomo-datasets/master/data/project.json (No such file or directory)

Can you please give some context on what you are trying to do?

martinschorb commented 3 years ago

Hi,

I am trying to open https://github.com/mobie/centrioles-tomo-datasets from MoBIE. I generated the remote XMLs through add_remote_project_metadata, but I don't think the problem is there. It looks like it's happening in the viewer.

constantinpape commented 3 years ago

I am trying to open https://github.com/mobie/centrioles-tomo-datasets from MoBIE.

Are you using the latest MoBIE-beta version? I cannot check this dataset because I don't have permission to access the s3 data, but I checked a similar dataset and it works for me without issues.

I generated the remote XMLs through add_remote_project_metadata, but I don't think the problem is there.

I checked the xmls and indeed they look good.

In order to debug further I would need read access for the s3 bucket.

martinschorb commented 3 years ago

I just tried on another Mac with the current MoBIE-Beta and also cannot get access. I've asked to set the bucket to visible without authentication.

martinschorb commented 3 years ago

OK, it works now. I am also using a fresh Fiji with MoBIE-Beta. I need to check the other installation to see what went wrong there...