mobie / mobie.github.io

Describe how to create a S3 project #67

Open tischi opened 2 years ago

tischi commented 2 years ago

@K-Meech @constantinpape

Could we add step-by-step documentation on how to create an S3 MoBIE project? In fact, I need to do this myself this week :-)

I guess this should also contain something like this:

If you need a new bucket, open a ticket with IT so that they can create one for you. Then copy the whole MoBIE project to the bucket via "mc cp -r data/ embl//"

constantinpape commented 2 years ago

Here is a rough outline (Marking the part that is EMBL specific):

tischi commented 2 years ago

Does one need to select an S3 compatible image data format during the project creation?

constantinpape commented 2 years ago

> Does one need to select an S3 compatible image data format during the project creation?

Yes. However, bdv.hdf5 is the only format that does not support S3; all others are S3 compatible.

K-Meech commented 2 years ago

Hi both - adding the metadata is implemented in the project creator. There's a button (next to the one for opening the project in MoBIE) for adding the required metadata. Happy to add this to the docs - next week though, as I'm on holiday for the rest of this week :)

tischi commented 2 years ago

@K-Meech I tried it and it does add the metadata to the dataset.json but not to the project.json. Should I add it there manually (@constantinpape)?

[screenshot]

constantinpape commented 2 years ago

> Should I add it there manually (@constantinpape)?

To do it manually, you would need to add bdv.ome.zarr.s3 here. But I think this should be fixed in the project creator, so that it also updates the project.json.
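
For the manual route, a minimal sketch in Python. It assumes that project.json keeps its formats in a top-level list called `imageDataFormats` (an assumption here, so check it against your project's spec version); the helper name `add_image_data_format` is made up for illustration.

```python
import json
from pathlib import Path


def add_image_data_format(project_json, fmt="bdv.ome.zarr.s3", key="imageDataFormats"):
    """Add fmt to the format list in project.json if it is not there yet.

    `key` is an assumed field name; adjust it to the project's spec version.
    Returns the updated list of formats.
    """
    path = Path(project_json)
    project = json.loads(path.read_text())
    formats = project.setdefault(key, [])
    if fmt not in formats:
        formats.append(fmt)
        # Only rewrite the file when something actually changed.
        path.write_text(json.dumps(project, indent=2) + "\n")
    return formats
```

Calling it twice is harmless, since the format is only appended when it is missing.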

tischi commented 2 years ago

> But I think this should be fixed in the project creator, so that it also updates it in the project.

Apparently it does not yet, see screenshot above (ping @K-Meech).

> mc cp -r data/ embl/

@constantinpape Does that mean that we do not have the data subfolder on S3? Why not?

constantinpape commented 2 years ago

> @constantinpape Does that mean that we do not have the data subfolder on S3? Why not?

Yes, we don't have a data sub-folder on S3. The reason is that all paths in our metadata are stored relative to the project root directory, which is data. If we included the data folder, the project root directory would differ from the bucket root, and we would need to rewrite all the file paths for S3 in the metadata.
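
To make the path argument concrete, here is a small sketch of how local files map to object keys when only the contents of data/ are uploaded. The function `s3_key` and the example file names are hypothetical; the point is just that the keys stay identical to the paths relative to the project root.

```python
from pathlib import PurePosixPath


def s3_key(local_path, project_root="data"):
    """Return the object key for a file when the contents of project_root
    are copied to the bucket root: the path relative to project_root."""
    return PurePosixPath(local_path).relative_to(project_root).as_posix()


# Hypothetical files; the keys equal the root-relative paths used in the metadata.
print(s3_key("data/project.json"))             # project.json
print(s3_key("data/my-dataset/dataset.json"))  # my-dataset/dataset.json
```

If the data folder itself were uploaded, every key would carry an extra data/ prefix that the metadata does not know about.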

tischi commented 2 years ago

OK, I see! Maybe something to iron out at some point.

I guess the main advantage of the data folder is that we can have other "project-related-stuff-that-is-not-read-by-mobie" next to it, isn't it? To enable the same on S3, this "misc-stuff" could go into a "misc" sub-folder on the same level as the project.json.

constantinpape commented 2 years ago

> OK, I see! Maybe something to iron out at some point.

Sure, changing this would not be difficult. On the mobie-fiji side it should all work already (I added some code a while ago that checks whether the project.json is in the specified project folder or in its data sub-folder). So this only needs to change in the metadata generation.
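
A rough Python sketch of what such a check could look like; this is not the mobie-fiji code, just the idea. The function name `find_project_root` is invented here.

```python
from pathlib import Path


def find_project_root(location):
    """Return the folder that contains project.json: either `location`
    itself or its `data` sub-folder. Raise if neither has it."""
    location = Path(location)
    for candidate in (location, location / "data"):
        if (candidate / "project.json").is_file():
            return candidate
    raise FileNotFoundError(f"no project.json in {location} or {location / 'data'}")
```

The specified location is tried first, so a project.json at the top level wins over one in data.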

> I guess the main advantage of the data folder is that we can have other "project-related-stuff-that-is-not-read-by-mobie" next to it, isn't it? To enable the same on S3, this "misc-stuff" could go into a "misc" sub-folder on the same level as the project.json.

Yes, the main advantage of the data sub-folder is that it keeps other things separate from the data, especially for projects hosted on GitHub, where the code for data generation is also part of the repo.