seung-lab / cloud-volume

Read and write Neuroglancer datasets programmatically.
https://twitter.com/thundercloudvol
BSD 3-Clause "New" or "Revised" License
130 stars · 45 forks

Questions on understanding the workflow for generating mesh "precomputed" data #406

Closed manoaman closed 3 years ago

manoaman commented 3 years ago

Hello,

I am trying to understand the workflow for generating mesh data to visualize in Neuroglancer. I have figured out, from the previous examples, how to generate a precomputed dataset from a tiff stack. However, I have not figured out how to create working mesh data. If I have a precomputed dataset generated from a stack of tiff images, can I generate mesh data from it? Or should I process the stack of tiffs into a 3d numpy array and then use CloudVolume/Igneous to generate the mesh? It would be great if you could point me in the right direction.

I am looking at these examples.

https://github.com/seung-lab/cloud-volume#info-files---new-dataset
https://github.com/seung-lab/cloud-volume/wiki/Example-Single-Machine-Dataset-Upload
https://github.com/seung-lab/igneous#in-memory-queue-simple-execution

Thank you, m

william-silversmith commented 3 years ago

Hi m,

You seem to be on the right track. I'm assuming that you're using labeled segmentation and not simply raw EM images. We can generate meshes from the former, but not the latter.

The appropriate sequence is:

1. Convert the segmentation into Precomputed format (it can now be visualized in neuroglancer).
2. (optional) Use Igneous to downsample the layers.
3. Use Igneous to execute MeshTasks against the entire Precomputed dataset (you can use a downsampled version to reduce the amount of work) -- you won't see meshes yet.
4. Use Igneous to execute MeshManifestTasks against the meshes, which creates files that tell neuroglancer how to view the mesh fragments generated in step 3.

Let me know if you need more guidance.

Will

manoaman commented 3 years ago

Hi Will,

Thank you for laying out the appropriate sequence. And I'm glad you mentioned that raw EM images will not work for generating meshes. (I was confused into thinking meshes could be generated from raw images with CloudVolume & Igneous.)

I am looking at the following example (URL) and trying to generate precomputed mesh data like the "ground-truth" layer in this example. Since you mentioned raw EM images do not work for generating meshes, what file format should I prepare to work with labeled segmentation in CloudVolume? In the meantime, is there a sample dataset I could use to test the mesh generation sequence?

Thank you, m

https://neuroglancer-demo.appspot.com/#!%7B%22dimensions%22:%7B%22x%22:%5B8e-9%2C%22m%22%5D%2C%22y%22:%5B8e-9%2C%22m%22%5D%2C%22z%22:%5B8e-9%2C%22m%22%5D%7D%2C%22position%22:%5B2980.186767578125%2C3153.929443359375%2C4045%5D%2C%22crossSectionScale%22:2.8863709892679617%2C%22projectionOrientation%22:%5B0.2747986614704132%2C0.7059817314147949%2C0.6514520049095154%2C-0.041058510541915894%5D%2C%22projectionScale%22:4593.980956070107%2C%22layers%22:%5B%7B%22type%22:%22image%22%2C%22source%22:%22precomputed://gs://neuroglancer-public-data/flyem_fib-25/image%22%2C%22name%22:%22image%22%7D%2C%7B%22type%22:%22segmentation%22%2C%22source%22:%22precomputed://gs://neuroglancer-public-data/flyem_fib-25/ground_truth%22%2C%22tab%22:%22segments%22%2C%22segments%22:%5B%22158571%22%2C%2221894%22%2C%2222060%22%2C%2224436%22%2C%222515%22%5D%2C%22name%22:%22ground-truth%22%7D%5D%2C%22showSlices%22:false%2C%22selectedLayer%22:%7B%22layer%22:%22ground-truth%22%2C%22visible%22:true%7D%2C%22layout%22:%224panel%22%7D

william-silversmith commented 3 years ago

Hi again m,

In other fields of medical imaging, it is possible to produce at least rough meshes directly from grayscale images using the marching cubes algorithm, but in connectomics the label segmentation problem is too difficult for such basic techniques: the neurons are dense and interwoven, so they would somehow have to be picked out manually.

(image: ground-truth segmentation overlaid on an EM image from the Kasthuri dataset)

In Igneous, we use the zmesh implementation of the marching cubes algorithm, which is specialized for labeled segmentations (i.e. multi-label "binary" images). Labeled segmentations assign a label to every voxel in the imaging field, as in the ground-truth segmentation overlaid on the EM image in the Kasthuri dataset above. The meshes are then generated from these dense labelings.
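For intuition, here is a toy illustration in plain numpy (not zmesh itself) of what a labeled segmentation provides: every voxel carries an integer label, and meshing reduces to extracting a binary mask per object and running marching cubes on each mask.

```python
import numpy as np

# Toy "labeled segmentation": every voxel is assigned an integer label
# (0 is conventionally background).
seg = np.zeros((4, 4, 4), dtype=np.uint16)
seg[0:2, :, :] = 1  # object 1 fills one half of the volume
seg[2:4, :, :] = 2  # object 2 fills the other half

# Meshing operates per label: extract a binary mask for each object,
# then run marching cubes on each mask.
labels = [int(l) for l in np.unique(seg) if l != 0]
masks = {l: (seg == l) for l in labels}

for l in labels:
    print(l, int(masks[l].sum()))  # each object covers 32 of the 64 voxels
```

This per-voxel labeling is exactly what a grayscale EM image lacks: without it there is no mask to hand to marching cubes.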

For a nice smallish dataset that you can experiment with, check out CREMI.

manoaman commented 3 years ago

Hi Will,

Thank you for the background and examples; it helps my understanding. If I understand correctly, segmented neurons are represented with SWC files. Do CloudVolume and/or Igneous support processing a single or multiple SWC files into precomputed data?

Also, what should be specified for encoding in the initial step of running CloudVolume.create_new_info? I wasn't clear on whether to specify raw or compressed_segmentation.

Thank you, m

william-silversmith commented 3 years ago

Hi m,

SWC files correspond specifically to skeleton representations. Mesh representations are stored differently as are voxel labeling representations. You can read more about skeletons here. Igneous operates from the assumption that the most basic unit of segmentation is the labeled voxel representation from which both meshes and skeletons can be derived.

For creating a new segmentation image layer: raw images are just gzip compressed numpy arrays, while compressed_segmentation uses some tricks and is typically both smaller and faster for IO (and is also gzipped).
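As a rough sketch of what the raw option amounts to (an illustration of the idea, not CloudVolume's actual chunk codec): the bare voxel bytes with gzip layered on top.

```python
import gzip

import numpy as np

# A toy chunk of segmentation data.
chunk = np.arange(64, dtype=np.uint16).reshape((4, 4, 4))

# "raw": the bare voxel bytes, usually gzip-compressed for storage/transfer.
stored = gzip.compress(chunk.tobytes())

# Decoding reverses the two steps.
decoded = np.frombuffer(gzip.decompress(stored), dtype=np.uint16).reshape(chunk.shape)

assert np.array_equal(decoded, chunk)
```

Per the description above, compressed_segmentation replaces the plain `tobytes()` step with a smarter block-based encoding of the labels before the gzip layer.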

Will

manoaman commented 3 years ago

Hi Will,

Is it possible to have a collection of SWC files (skeleton representations) converted into a single tiff file (or another format) which is a mesh representation?

-m

william-silversmith commented 3 years ago

Sort of. If the skeleton nodes come with an associated radius attribute, it's possible to "garb" the skeleton with a set of appropriately sized spheres in voxel space. Then the binary image can be meshed. I don't have much experience with garbing, but given the small amount of information contained in the radius, only a very rough sketch of the surface can be reconstructed. For example, in Kimimaro we generate the skeleton radius using the minimum distance to the boundary, so larger extrusions will be missed. Additionally, many skeletons are downsampled, so you lose a lot of fidelity. If the radius wasn't downsampled along with the skeleton, you'll end up with little more than a stick with bubbles on it.
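The garbing idea can be sketched with a hypothetical numpy helper (not Kimimaro's or Igneous's actual code): stamp a ball of each node's radius into a voxel grid, yielding a binary image that could then be meshed.

```python
import numpy as np

def garb(shape, nodes):
    """Paint a sphere of radius r at each (x, y, z, r) skeleton node."""
    vol = np.zeros(shape, dtype=bool)
    zz, yy, xx = np.indices(shape)
    for (x, y, z, r) in nodes:
        vol |= (xx - x) ** 2 + (yy - y) ** 2 + (zz - z) ** 2 <= r ** 2
    return vol

# Two nodes along a short straight "skeleton", radius 2 voxels each.
binary = garb((10, 10, 10), [(3, 5, 5, 2), (6, 5, 5, 2)])
print(int(binary.sum()))  # union of the two balls; meshable with marching cubes
```

As noted above, the result is only as good as the radii: essentially a stick with bubbles on it.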

william-silversmith commented 3 years ago

I haven't read this paper in detail, but it appears to be a review of the different ways to get a coarse mesh from a skeleton: https://hal.inria.fr/hal-01532765v2/document

manoaman commented 3 years ago

Igneous has SkeletonTask and SkeletonMergeTask. Are they compatible with SWC format also?

manoaman commented 3 years ago

Going back to the initial workflow for generating meshes, I've tried running the following code and then tried to load the files generated in dest_layer_path in Neuroglancer. However, I only see a yellow bounding box and no mesh segmentation. Did I miss anything here? I do see a mesh_mip_1_err_40 folder generated along with the precomputed files.

script.py

from cloudvolume import Vec
from taskqueue import LocalTaskQueue
import igneous.task_creation as tc

# MORF
src_layer_path = 'precomputed://file://../../precomputed'
dest_layer_path = 'file://precomputed_mesh'
mip = 1

with LocalTaskQueue(parallel=8) as tq:

    tasks = tc.create_transfer_tasks(
        src_layer_path, dest_layer_path, 
        chunk_size=(64,64,64), shape=Vec(512,512,512),
        fill_missing=False, translate=(0,0,0), 
        bounds=None, mip=0, preserve_chunk_size=True,
        encoding=None, skip_downsamples=False,
        delete_black_uploads=False
    ) 
    tq.insert_all(tasks)

print("Done downsample !!")

with LocalTaskQueue(parallel=8) as tq2:
    tasks = tc.create_meshing_tasks(             # First Pass
        dest_layer_path, # Which data layer 
        mip, # Which resolution level to mesh at (we often choose near isotropic resolutions)
        shape=(512, 512, 512), # Size of a task to mesh, chunk alignment not needed
        simplification=True, # Whether to enable quadratic edge collapse mesh simplification
        max_simplification_error=40, # Maximum physical deviation of mesh vertices during simplification
        mesh_dir=None, # Optionally choose a non-default location for saving meshes 
        cdn_cache=False, # Disable caching in the cloud so updates aren't painful to view
        dust_threshold=None, # Don't bother meshing below this number of voxels
        object_ids=None, # Optionally, only mesh these labels.
        progress=False, # Display a progress bar (more useful locally than in the cloud)
        fill_missing=False, # If part of the data is missing, fill with zeros instead of raising an error 
        encoding='precomputed', # 'precomputed' or 'draco' (don't change this unless you know what you're doing)
        spatial_index=True, # generate a spatial index for querying meshes by bounding box
        sharded=False, # generate intermediate shard fragments for later processing into sharded format
    ) 
    tasks = tc.create_mesh_manifest_tasks(dest_layer_path, magnitude=3) # Second Pass  
    tq2.insert_all(tasks)
print("Done mesh !!")

info file

{
  "data_type": "uint16",
  "mesh": "mesh_mip_1_err_40",
  "num_channels": 1,
  "scales": [
    {
      "chunk_sizes": [
        [
          64,
          64,
          64
        ]
      ],
      "encoding": "raw",
      "key": "201_201_998",
      "resolution": [
        201,
        201,
        998
      ],
      "size": [
        2048,
        2048,
        96
      ],
      "voxel_offset": [
        0,
        0,
        0
      ]
    },
    {
      "chunk_sizes": [
        [
          64,
          64,
          64
        ]
      ],
      "encoding": "raw",
      "key": "402_402_998",
      "resolution": [
        402,
        402,
        998
      ],
      "size": [
        1024,
        1024,
        96
      ],
      "voxel_offset": [
        0,
        0,
        0
      ]
    },
    {
      "chunk_sizes": [
        [
          64,
          64,
          64
        ]
      ],
      "encoding": "raw",
      "key": "804_804_998",
      "resolution": [
        804,
        804,
        998
      ],
      "size": [
        512,
        512,
        96
      ],
      "voxel_offset": [
        0,
        0,
        0
      ]
    },
    {
      "chunk_sizes": [
        [
          64,
          64,
          64
        ]
      ],
      "encoding": "raw",
      "key": "1608_1608_998",
      "resolution": [
        1608,
        1608,
        998
      ],
      "size": [
        256,
        256,
        96
      ],
      "voxel_offset": [
        0,
        0,
        0
      ]
    }
  ],
  "type": "segmentation"
}
william-silversmith commented 3 years ago

Igneous has SkeletonTask and SkeletonMergeTask. Are they compatible with SWC format also?

Those tasks generate binary files that specify CloudVolume Skeleton objects. Those objects have a skeleton.to_swc() method that you can use as well as a Skeleton.from_swc(...) method.
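For reference, SWC itself is a simple whitespace-delimited text format: one node per line with id, type, x, y, z, radius, and parent id (-1 for the root). A minimal standalone parser (illustrative only, not the Skeleton.from_swc implementation) looks like:

```python
def parse_swc(text):
    """Parse SWC text into a list of node dicts: id, type, x, y, z, radius, parent."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip comments and blank lines
        nid, ntype, x, y, z, radius, parent = line.split()
        nodes.append({
            'id': int(nid), 'type': int(ntype),
            'x': float(x), 'y': float(y), 'z': float(z),
            'radius': float(radius), 'parent': int(parent),
        })
    return nodes

swc = """# toy two-node skeleton
1 0 0.0 0.0 0.0 5.0 -1
2 0 10.0 0.0 0.0 4.0 1
"""
nodes = parse_swc(swc)
print(len(nodes))  # 2
```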

I only see yellow bounding box and not mesh segmentation.

I think the code structure you posted failed to generate the meshes. I would recommend the following structure that ensures the MeshTasks run first.

with LocalTaskQueue(parallel=8) as tq:
    tasks = tc.create_meshing_tasks(... ) 
    tq.insert_all(tasks)
print("Done Meshing")
with LocalTaskQueue(parallel=8) as tq:
    tasks = tc.create_mesh_manifest_tasks(dest_layer_path, magnitude=3) # Second Pass  
    tq.insert_all(tasks)
print("Done Creating Manifests")
manoaman commented 3 years ago

Those tasks generate binary files that specify CloudVolume Skeleton objects. Those objects have a skeleton.to_swc() method that you can use as well as a Skeleton.from_swc(...) method.

I don't think I found those functions in the docs; thank you.

Hmm... it seems I'm still not seeing the meshes. Could I still be doing something wrong when running the mesh tasks? These are the precomputed files from before running the mesh tasks; I was able to view them in Neuroglancer as a segmentation layer.

https://drive.google.com/file/d/18tqzsfYjJZ-eSbuF17di7HuIBOw1KQ3L/view?usp=sharing

william-silversmith commented 3 years ago

There's some additional documentation here: https://github.com/seung-lab/cloud-volume/wiki/Advanced-Topic:-Skeletons

For some reason I wasn't able to download the segmentation, but I'll try again later. It might be more informative if you could show me a listing of the files in your mesh directory, and in particular whether there is an info file in there (not the main info file, a second one).

manoaman commented 3 years ago

Hi Will,

I uploaded the files again and raised the access level. Please try again. Thank you so much for helping me out.

https://drive.google.com/drive/folders/1K-TGEvXpq5K3JafLe9hW2rJh8bd_qfaF?usp=sharing

-m

manoaman commented 3 years ago

Hi Will, were you able to access the previous files? This is the original tiff stack of the segmentation, which I was able to view after running the following script. I'm now trying to run the mesh task & mesh manifest task after this script. Could I be doing something wrong?

https://drive.google.com/file/d/13-hsm_yCRTgIwWgvZYHlEhpXbmOni9Bj/view?usp=sharing

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import sys
from concurrent.futures import ProcessPoolExecutor

import numpy as np
from PIL import Image

from cloudvolume import CloudVolume
from cloudvolume.lib import mkdir, touch

info = CloudVolume.create_new_info(
    num_channels=1,
    layer_type='segmentation',    # 'image' or 'segmentation'
    data_type='uint16',           # can pick any popular uint
    encoding='raw',               # other options: 'jpeg', 'compressed_segmentation' (req. uint32 or uint64)
    resolution=[201, 201, 998],   # X,Y,Z values in nanometers
    voxel_offset=[0, 0, 0],       # X,Y,Z values in voxels
    chunk_size=[256, 256, 1],     # rechunk of image X,Y,Z in voxels
    volume_size=[2048, 2048, 96], # X,Y,Z size in voxels
    )

# If you're using amazon or the local file system, you can replace 'gs' with 's3' or 'file'

try:
  vol = CloudVolume('file://precomputed_mesh', info=info)
  vol.provenance.description = 'Description of Data'
  vol.provenance.owners = ['....']  # list of contact email addresses

  vol.commit_info()  # generates gs://bucket/dataset/layer/info json file
  vol.commit_provenance()  # generates gs://bucket/dataset/layer/provenance json file

  direct = '.....'

  progress_dir = mkdir('progress')  # unlike os.mkdir doesn't crash on prexisting
  done_files = set([int(z) for z in os.listdir(progress_dir)])
  all_files = set(range(vol.bounds.minpt.z, vol.bounds.maxpt.z + 1))

  to_upload = [int(z) for z in list(all_files.difference(done_files))]
  to_upload.sort()
except IOError as err:
  errno, strerror = err.args
  print ('I/O error({0}): {1}'.format(errno, strerror))
  print (err)
except ValueError as ve:
  print ('Could not convert data to an integer.')
  print (ve) 
except:
  print ('Unexpected error:', sys.exc_info()[0])
  raise

def process(z):
    try:
      img_name = 'Z%02d.tif' % z
      print('Processing ', img_name)
      image = Image.open(os.path.join(direct, img_name))
      (width, height) = image.size
      array = np.array(list(image.getdata()), dtype=np.uint16, order='F')
      array = array.reshape((1, height, width)).T
      vol[:, :, z] = array
      image.close()
      touch(os.path.join(progress_dir, str(z)))
    except IOError as err:
      errno, strerror = err.args
      print ('I/O error({0}): {1}'.format(errno, strerror))
      print (err)
    except ValueError as ve:
      print ('Could not convert data to an integer.')
      print (ve) 
    except:
      print ('Unexpected error:', sys.exc_info()[0])
      raise

with ProcessPoolExecutor(max_workers=8) as executor:
    executor.map(process, to_upload)
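The reshape in `process` above can be sanity-checked with a small numpy snippet: PIL's `getdata()` yields pixels in row-major (height, width) order, and `reshape((1, height, width)).T` transposes them into the (x, y, z) order CloudVolume expects.

```python
import numpy as np

height, width = 3, 2
flat = np.arange(height * width, dtype=np.uint16)  # stand-in for image.getdata()

array = flat.reshape((1, height, width)).T
print(array.shape)  # (2, 3, 1), i.e. (width, height, 1) == (x, y, z)
```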

-m

william-silversmith commented 3 years ago

Hi m,

I looked at the file structure and saw there was an info file in the mesh directory. You can try deleting that file and see if it helps. I also saw that the names of the files looked odd: they use '/' in the file name instead of ':'. I'm not sure how that happened.

manoaman commented 3 years ago

So I tried taking out the info file, but that only resulted in a 404 error from Neuroglancer. File names do appear with '/' in the folder, but they actually show up with ':' from the locally served file server. (Could that be for escaping meta characters?)

One question I had was about the mip value. It seems I used mip=0 for create_transfer_tasks and mip=1 for create_meshing_tasks. Should the values be consistent, and what number should I be specifying here?

(folder structure after mesh tasks.)

├── 1608_1608_998
├── 201_201_998
├── 402_402_998
├── 804_804_998
├── info
├── mesh_mip_1_err_40
│   └── info   ------------------> (Took this out.)
└── provenance
manoaman commented 3 years ago

I tried two things: consistently using mip=0 in my script, and taking the info file out of the folder. And this is what I am seeing.

(screenshot: after meshing)

The segmentations appear the same as before running the mesh tasks: all flat on the section planes. Should I be increasing the mip number, and will I then start seeing segmentations as volume blocks?

william-silversmith commented 3 years ago

Oh. I know what the problem is. The meshes are written to the directory as .gz, which is not how neuroglancer looks for them. This change was necessitated by the fact that ordinary filesystems don't store metadata like the content-encoding of the file, so I added a suffix. To view this dataset, do the following:

from cloudvolume import CloudVolume
cv = CloudVolume('file://./mesh_test')
cv.viewer()
>>> Neuroglancer server listening to http://localhost:1337

Then navigate to precomputed://http://localhost:1337 in Neuroglancer.

(image: the dataset rendering in Neuroglancer via the local CloudVolume viewer)
manoaman commented 3 years ago

Hi Will, sorry, I think I'm lost here... if I start the microviewer and access http://localhost:1337, I get a 404 and it does not load up in Neuroglancer. Do I still take the info file out of the mesh folder?

(Screenshots: the 404 response at http://localhost:1337, the error in Neuroglancer, and the terminal output.)

-m

william-silversmith commented 3 years ago

You should be able to navigate in your browser to http://localhost:1337/info; if that doesn't come up, neuroglancer won't see anything. Both info files should be in place.

manoaman commented 3 years ago

Hmm... strange. I do have both info files and I was able to download http://localhost:1337/info. Nevertheless, Neuroglancer gives the same error.

william-silversmith commented 3 years ago

Sorry this is being such a troubling process. :-/ One other strategy you can try is to decompress (gunzip) all the .gz files and use either the .viewer or another static file server. Without the .gz suffix, neuroglancer will have no problem finding the files.
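The decompression pass can be scripted; here is a stdlib-only sketch (a hypothetical helper, adjust paths to your layout) that strips the .gz suffix in place for every compressed file under a directory:

```python
import gzip
import os
import tempfile

def gunzip_tree(root):
    """Decompress every .gz file under root, writing the suffix-less name."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith('.gz'):
                continue
            src = os.path.join(dirpath, name)
            with gzip.open(src, 'rb') as f:
                data = f.read()
            with open(src[:-3], 'wb') as f:  # drop ".gz" so neuroglancer finds it
                f.write(data)
            os.remove(src)

# demo on a throwaway directory with one fake mesh fragment
root = tempfile.mkdtemp()
with gzip.open(os.path.join(root, 'frag.gz'), 'wb') as f:
    f.write(b'fragment bytes')
gunzip_tree(root)
print(sorted(os.listdir(root)))  # ['frag']
```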

fcollman commented 3 years ago

this error is because you are serving the data over http but are running neuroglancer from an https site, so you are getting a CORS security error. You need to either run neuroglancer from http or serve the files with an https-compatible server.

manoaman commented 3 years ago

@william-silversmith Thank you for being patient with me :) I'm sorry for asking so many things. I'll give the decompression approach a try.

@fcollman actually, I'm serving both data and Neuroglancer locally on http so I don't think I'm using https at this moment.

william-silversmith commented 3 years ago

Strange. I think you might be onto something, Forrest, but when I tried (on macOS with Firefox and Chrome) I didn't encounter the error. On the other hand, I've been tweaking this machine for years, so who knows if I disabled something.

I was using https://neuroglancer-demo.appspot.com/

manoaman commented 3 years ago

Oh, interesting. When I loaded from https://neuroglancer-demo.appspot.com/ the mesh did show up. I'd have to think the latest Neuroglancer I'm using on my machine is not compatible? Strange, but thank you for helping me out @william-silversmith. I also figured out I needed to double click on segmentations on the planes to generate the mesh in Neuroglancer. (Should have tried that...)

By the way, is something failing during the mesh tasks when I see err in my folder name mesh_mip_1_err_40? I should probably understand what mip needs to be configured here.

william-silversmith commented 3 years ago

Yay, I'm glad it's working! The err in the name refers to the maximum error applied during the mesh simplification process (in this case, 40 nm displacement of vertices). The mip refers to which mip level it was constructed from. These data are now recorded in the info file too, so the directory name is kind of an anachronism. All is well. :)
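In other words, the directory name is just those two parameters formatted into a string, something like:

```python
mip, err = 1, 40  # mip level meshed at, max simplification error in nm
mesh_dir = f"mesh_mip_{mip}_err_{err}"
print(mesh_dir)  # mesh_mip_1_err_40
```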


manoaman commented 1 year ago

Hello @william-silversmith ,

What are the command line equivalents for the following three tasks? Something like igneous image xfer and igneous mesh xfer?

Thanks, -m

with LocalTaskQueue(parallel=8) as tq:

    tasks = tc.create_transfer_tasks(
        src_layer_path, dest_layer_path, 
        chunk_size=(64,64,64), shape=Vec(512,512,512),
        fill_missing=False, translate=(0,0,0), 
        bounds=None, mip=0, preserve_chunk_size=True,
        encoding=None, skip_downsamples=False,
        delete_black_uploads=False  
    ) 
    tq.insert_all(tasks)

print("Done downsample !!")

with LocalTaskQueue(parallel=8) as tq:
    tasks = tc.create_meshing_tasks(             # First Pass
        dest_layer_path, # Which data layer 
        mip, # Which resolution level to mesh at (we often choose near isotropic resolutions)
        shape=(512, 512, 512), # Size of a task to mesh, chunk alignment not needed
        simplification=True, # Whether to enable quadratic edge collapse mesh simplification
        max_simplification_error=40, # Maximum physical deviation of mesh vertices during simplification
        mesh_dir=None, # Optionally choose a non-default location for saving meshes 
        cdn_cache=False, # Disable caching in the cloud so updates aren't painful to view
        dust_threshold=None, # Don't bother meshing below this number of voxels
        object_ids=None, # Optionally, only mesh these labels.
        progress=False, # Display a progress bar (more useful locally than in the cloud)
        fill_missing=False, # If part of the data is missing, fill with zeros instead of raising an error 
        encoding='precomputed', # 'precomputed' or 'draco' (don't change this unless you know what you're doing)
        spatial_index=True, # generate a spatial index for querying meshes by bounding box
        sharded=False, # generate intermediate shard fragments for later processing into sharded format
    ) 
    tq.insert_all(tasks)
print("Done create_meshing_tasks !!")

with LocalTaskQueue(parallel=8) as tq:
    tasks = tc.create_mesh_manifest_tasks(dest_layer_path, magnitude=3) # Second Pass  
    tq.insert_all(tasks)
print("Done create_mesh_manifest_tasks !!")
william-silversmith commented 1 year ago

Hi m,

Here is a reasonable translation of the commands. The words in ALL CAPS are variables for you to fill in.

igneous image xfer SRC DEST --chunk-size 64,64,64 --mip 0 --queue ./queue
igneous --parallel 8 execute -x queue

igneous mesh forge PATH --mip MIP --queue ./queue
igneous --parallel 8 execute -x queue
igneous mesh merge PATH --magnitude 3 --queue ./queue
igneous execute -x queue

# queue management
ptq status ./queue
rm -r queue
manoaman commented 1 year ago

Thank you Will. For 2x2x2 downsampling, I tried igneous image xfer SRC DEST --chunk-size 64,64,64 --volumetric --mip 0 --queue ./queue. However, I only see the first mip level. Do I need to specify --volumetric and/or additional options in the subsequent commands?

william-silversmith commented 1 year ago

I think I tried that recently and discovered that I did something wrong with the --volumetric option. Let me investigate and get back to you.

william-silversmith commented 1 year ago

Hi m,

It looks like the default --shape argument, which had a z size of 64, was overriding the automatically computed size. I am going to release a fix for this.

william-silversmith commented 1 year ago

Released!

manoaman commented 1 year ago

Thank you Will!!