glyg opened 3 years ago
This issue has been mentioned on Image.sc Forum. There might be relevant details there:
https://forum.image.sc/t/next-call-on-next-gen-bioimaging-data-tools-feb-23/48386/9
cc: @jfkotw @kephale who were also interested during the meeting. I defer on whether or not this issue covers all of "vector-based".
I'd argue for GeoJSON for ROIs and points and such & keep meshes in their niche
@glyg, so this block from ply-zarr is the critical bit for discussion?
```python
from datetime import datetime

ply_header = {
    "format": "ascii 1.0",
    "comments": [f"created by ply_zarr v0.0.1, {datetime.now().isoformat()}"],
    "elements": {
        "vertex": {
            "size": 47,
            "properties": [
                ("double", "x"),
                ("double", "y"),
                ("double", "z"),
            ],
        },
        "face": {
            "size": 105,
            "properties": [
                ("list", "uint8", "int32", "vertex_indices"),
            ],
        },
    },
}
```
Yes, this mirrors the specification for the PLY header; it then seems natural to store the faces in separate arrays according to their number of sides.
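A minimal sketch of what that could look like with zarr-python (the group layout and names here are hypothetical, not the actual ply-zarr API): vertex properties in one array, plus one fixed-width integer array per polygon size.

```python
from datetime import datetime

import numpy as np
import zarr

# hypothetical layout: one Zarr group per mesh
root = zarr.open_group("mesh.zarr", mode="w")

# vertex properties x, y, z as an (n_vertices, 3) float64 array
vertices = np.random.rand(47, 3)
root.create_dataset("vertex", data=vertices)

# faces grouped by their number of sides: triangles under face/3, quads under face/4
triangles = np.random.randint(0, 47, size=(80, 3), dtype="int32")
quads = np.random.randint(0, 47, size=(25, 4), dtype="int32")
root.create_dataset("face/3", data=triangles)
root.create_dataset("face/4", data=quads)

# keep a PLY-style header as JSON-serializable group attributes
root.attrs["ply_header"] = {
    "format": "ascii 1.0",
    "comments": [f"created by ply_zarr v0.0.1, {datetime.now().isoformat()}"],
}
```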
see a more concrete example of mixing meshes, images and labels here
I assume the xarray compatibility also applies here; I'll look into that next.
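For what it's worth, a sketch of that compatibility under the layout above: xarray's Zarr reader looks for an `_ARRAY_DIMENSIONS` attribute on each array, so tagging the vertex array should be enough to open it with labelled dimensions (the dimension names below are made up).

```python
import xarray as xr
import zarr

root = zarr.open_group("mesh.zarr", mode="a")
# xarray's Zarr convention: name each dimension of the array
root["vertex"].attrs["_ARRAY_DIMENSIONS"] = ["vertex_index", "axis"]

ds = xr.open_zarr("mesh.zarr")  # "vertex" appears as a labelled variable
```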
This issue has been mentioned on Image.sc Forum. There might be relevant details there:
https://forum.image.sc/t/ngff-status-update-may-2021/52918/1
cc @normanrz
We recently implemented the mesh format from Neuroglancer in webKnossos: https://github.com/google/neuroglancer/blob/master/src/neuroglancer/datasource/precomputed/meshes.md
It's been great for our purposes, in particular the multi-resolution support and the compact Draco encoding.
I think that format would be a great candidate to be adopted by OME-NGFF.
@normanrz thanks for the input, those features indeed sound great (esp. multi-res!).
If I understand correctly though, only triangular meshes are supported? The other consumer / producer of meshes is the modeling community (i.e. physical biology), who would need more generic meshes, for example with polygonal (>3) 2D cells, polyhedral 3D cells, or even quadratic tetrahedra.
Would draco be able to handle that kind of data? How would a zarr implementation work? Is it enough to "just" put the draco-encoded data in the store and add a dependency to be able to read / write it?
Also, maybe storing generic FEM meshes is out of scope for ome-ngff and triangles are enough.
> If I understand correctly though, only triangular meshes are supported? The other consumer / producer of meshes is the modeling community (i.e. physical biology), who would need more generic meshes, for example with polygonal (>3) 2D cells, polyhedral 3D cells, or even quadratic tetrahedra.
Yes, I think draco only supports triangular meshes (and point clouds). We could look into allowing other encodings in addition to draco.
> How would a zarr implementation work? Is it enough to "just" put the draco-encoded data in the store and add a dependency to be able to read / write it?
That is a good question that we haven't fully figured out yet. We currently store all the data in a single binary file. The file consists of a) a directory structure (hash map) to locate the meshfile for a specified id within b) a long blob of mesh data. In b) each meshfile has a binary metadata header that describes the available chunks and levels of detail. One implementation on top of zarr would be to store each meshfile as one chunk (e.g. in a 2D uint8 array). This would create a lot of chunk files and might cause some issues, because the chunks will have different byte lengths.
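To make that concrete, here is a rough sketch of the "one meshfile per chunk" variant (all names are illustrative, not a spec): a 2D uint8 array with one row-chunk per mesh, padded to the longest blob, with the true byte lengths kept alongside — which shows exactly where the unequal chunk lengths bite.

```python
import numpy as np
import zarr

# hypothetical input: draco-encoded meshfile bytes keyed by segment id
meshfiles = {7: b"...draco bytes...", 42: b"...longer draco bytes..."}
ids = sorted(meshfiles)
max_len = max(len(b) for b in meshfiles.values())

# one row-chunk per mesh, padded to the longest meshfile
blobs = zarr.open_array(
    "meshes.zarr", mode="w",
    shape=(len(ids), max_len), chunks=(1, max_len), dtype="uint8",
)
lengths = []
for row, seg_id in enumerate(ids):
    buf = np.frombuffer(meshfiles[seg_id], dtype="uint8")
    blobs[row, : len(buf)] = buf
    lengths.append(len(buf))

# the directory structure (id -> row) and true byte lengths as attributes
blobs.attrs["segment_ids"] = ids
blobs.attrs["byte_lengths"] = lengths
```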
I would like to get involved in the discussion.
I think it would be great to have a format similar to the Neuroglancer format in OME. 3D data generation is getting more and more popular in the spatial biology field, and segmentations are a big part of it. Besides storing the volumetric (point cloud) data in OME-Zarr, it would be really great to have the same possibility for meshes.
I am wondering if there would be the possibility to discuss the format and specifications in a meeting or similar?
> I would like to get involved in the discussion.
Consider yourself involved! 🙂
> I think it would be great to have a format similar to the Neuroglancer format in OME.
Modulo https://xkcd.com/927/ of course. This is certainly something that I've heard several times recently as well, but it will certainly take one or more champions for it to happen. Also cc @jbms for how he weighs the changes as well as the pros & cons.
> I am wondering if there would be the possibility to discuss the format and specifications in a meeting or similar?
Most of the recent meetings have been around the challenge which is pushing forward Zarr v3 support (i.e., RFC-2). It's certainly time for a next general community meeting, or alternatively, a smaller group could start socializing the idea in preparation for a RFC.
I'm still here watching this thread, and would be happy to help get a small group discussing what the best options are for this!
I see!
I think there are similarities and differences between storing volumetric (point cloud) data and meshes. One main similarity with the standard Neuroglancer uses is:
Multi-resolution support for meshes! This is really crucial for the vast number of meshes we are going to store and load again.
I think the main difference is that meshes don't adhere to as nice a grid structure as the point clouds do. So I am wondering how we can store them at their multiple resolutions but still know where they are located in XYZ, so we can efficiently load them when needed.
So there might be more metadata needed, such as the bounding box, centroid, or other measures, to know if a mesh is visible at a certain location, so the client can decide whether it should be loaded or not.
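For illustration, a small sketch of such a visibility index (the array and function names are invented): one bounding box per mesh, queried with an axis-aligned overlap test before any mesh data is fetched.

```python
import numpy as np
import zarr

root = zarr.open_group("meshes.zarr", mode="a")
# one row per mesh: (xmin, ymin, zmin, xmax, ymax, zmax)
bboxes = root.create_dataset("bbox_index", shape=(2, 6), dtype="float64")

def register(row, vertices):
    """Record the bounding box of an (n, 3) vertex array in world coordinates."""
    bboxes[row, :3] = vertices.min(axis=0)
    bboxes[row, 3:] = vertices.max(axis=0)

def visible(view_min, view_max):
    """Return the rows whose bounding boxes overlap the view box."""
    b = bboxes[:]
    return np.flatnonzero(
        np.all(b[:, :3] <= view_max, axis=1) & np.all(b[:, 3:] >= view_min, axis=1)
    )
```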
Would really like to see initial mesh support (maybe based on the Neuroglancer format supporting Draco) in Zarr soon
> Would really like to see initial mesh support (maybe based on the Neuroglancer format supporting Draco) in Zarr soon
What would meshes look like in the Zarr data model? Zarr v3 doesn't have support yet for variable length types, so at a minimum we would need to add that, and even then I'm not sure how meshes, expressed as variable-length collections of geometrical objects, would be stored in an N-dimensional array. What would the array indices mean? I suspect people would fall back to 1D arrays, with maybe a second array for storing a spatial index? It could work, but it's not a great fit for Zarr IMO.
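For concreteness, that 1D fallback could look roughly like this (a sketch, not a proposal): one uint8 array holding all encoded meshes concatenated, and a second offsets array to slice them back out.

```python
import numpy as np
import zarr

encoded = [b"mesh-0-bytes", b"mesh-1-bytes", b"mesh-2-bytes"]  # e.g. draco blobs
offsets = np.cumsum([0] + [len(b) for b in encoded]).astype("uint64")

root = zarr.open_group("flat_meshes.zarr", mode="w")
root.create_dataset("data", data=np.frombuffer(b"".join(encoded), dtype="uint8"))
root.create_dataset("offsets", data=offsets)

def read_mesh(i):
    """Slice the i-th encoded mesh back out of the flat byte array."""
    start, stop = int(root["offsets"][i]), int(root["offsets"][i + 1])
    return bytes(root["data"][start:stop])
```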
On the other hand, the neuroglancer multiresolution mesh format seems perfectly fine on its own, outside of Zarr. So maybe just refining or generalizing that format as needed would be simpler than forcing it into Zarr.
I agree that the mesh format doesn't need to live in Zarr arrays. We could (mis)use uint8 arrays to store the bytes, but I don't know what value that would bring in comparison to just storing the blob alongside the Zarr arrays in the hierarchy. In general, I don't think that all pieces of OME-Zarr need to be Zarr.
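A sketch of that "blob alongside the hierarchy" option, since Zarr v2 stores are plain key-value mappings (the `meshes/` key naming below is made up, not a convention):

```python
import zarr

store = zarr.DirectoryStore("image.zarr")
root = zarr.group(store=store)
root.create_dataset("labels/0", shape=(64, 64, 64), dtype="uint32")

# raw meshfile bytes written next to the arrays; readers would need to
# agree on this convention out of band
store["meshes/42.drc"] = b"...draco-encoded meshfile..."
```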
So the idea would be to adopt the NeuroGlancer Format (https://github.com/google/neuroglancer/blob/master/src/datasource/precomputed/meshes.md#multi-resolution-mesh-format) and integrate it into OME-Zarr?
> So the idea would be to adopt the NeuroGlancer Format (https://github.com/google/neuroglancer/blob/master/src/datasource/precomputed/meshes.md#multi-resolution-mesh-format) and integrate it into OME-Zarr?
I think that would be a good way forward. There are a few details in terms of metadata and file layout that need to be figured out. Would be great to hear @jbms's feedback on this.
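Purely as a strawman for those metadata details (none of these keys exist in any spec), the pointer from an OME-Zarr group to a Neuroglancer-format mesh directory might be as simple as:

```python
# hypothetical .zattrs content for a labels group with associated meshes
mesh_attrs = {
    "meshes": {
        "version": "0.1-dev",                  # invented
        "format": "neuroglancer_precomputed",  # invented
        "encoding": "draco",                   # invented
        "source": "meshes/",                   # directory next to the arrays
    }
}
```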
A quick heads up that I heard from Jeremy today on a separate matter: he's been on leave. I very much assume when he's caught back up he'll chime in.
I just want to get this discussion running again. What would be potential next steps?
I think a meeting to sketch out an RFC would be a good next step. There should be an accompanying post on image.sc to announce that meeting.
@normanrz I'm not sure how crystallized the schedule is for the upcoming OME-NGFF workflows hackathon, but maybe carving out some (EST-timezone-friendly) slots would be convenient?
That sounds like a good plan to discuss in that timeframe!
Sorry, was on paternity leave until today.
As others have also stated, while meshes can be potentially thought of as collections of arrays of vertex properties and face properties, I think trying to represent them as zarr arrays directly would add a lot of complexity and not provide significant advantages, given how meshes are actually used in practice.
There is certainly a lot of room for improvement in the Neuroglancer precomputed multiscale mesh format (and the related annotation format) but I think if the existing format serves a decent number of use cases then it may be wise to standardize it as-is initially, and then once there is greater usage experience work on a revised format.
No worries!
Yes! I think this sounds like a really good plan! I think there is also a great need for more standardized creation and retrieval pipelines for the format. So I like your suggestion of first adopting it as-is and gradually improving it over time.
As discussed in the Feb. 2021 NGFF community call, and following this image.sc thread:
The idea is to follow the PLY specification to store meshes in ome-zarr. A PLY file is organised in a header, which declares the elements (e.g. vertex, face) and their typed properties, followed by the data for each element.
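For reference, the plain-text PLY header corresponding to the example discussed above looks like this:

```
ply
format ascii 1.0
comment created by ply_zarr
element vertex 47
property double x
property double y
property double z
element face 105
property list uint8 int32 vertex_indices
end_header
```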
There is a draft implementation here: https://github.com/centuri-engineering/ply-zarr
Some questions: