saalfeldlab / paintera

GNU General Public License v2.0

Using paintera for large proof-reading effort #401

Open constantinpape opened 4 years ago

constantinpape commented 4 years ago

I am in the process of setting up a proof-reading pipeline based on paintera for a large segmentation volume (https://www.biorxiv.org/content/10.1101/2020.02.26.961037v1.abstract) and wanted to share my strategy to deal with the issues I have encountered so far + issues I am still having:

  1. How to set up an environment for multiple proofreaders? For now, I am subdividing the volume into several blocks, mapping each segment to the block it overlaps most, and creating a new project for each block from the segments mapped to it. With this setup, a split error whose parts are mapped to different blocks cannot be corrected by a single proofreader, so another consensus step is needed after the proof-read sub-projects are merged.
  2. How to fix big merges? "Unmerging" big objects in paintera is currently not really feasible: detaching and re-merging a lot of fragments takes far too long, and painting boundaries and then using flood fill is very error-prone (and also very slow, because flood fill itself is slow). Instead, the solution we have settled on is to use a graph watershed (with graph and edge weights derived from the fragments + boundary / affinity predictions) with user-provided seeds. For now, I have a separate napari plugin to do this, but of course it would be smoother to have this integrated in paintera itself.
  3. We need more than a single lock function. Having the lock mechanism is great for marking segments as proof-read. However, we often have the situation that a segment cannot be proof-read at the current stage, either because it is split across block boundaries (see 1) or because it contains a large merge that we need to split outside of paintera (see 2). For these cases it would be very helpful to have a second category of lock, maybe called flag, that would also hide the segment, so that a proofreader does not visit the same segment over and over again, but that can be treated differently later in the pipeline.
  4. It would be useful if fragments could be assigned to the background segment id (0) somehow. As far as I know this is currently not possible.
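
The block mapping described in point 1 could look roughly like the following. This is only a minimal sketch, not the code used in the actual pipeline: the function name `map_segments_to_blocks`, the in-memory NumPy array, and the convention that id 0 is background are all assumptions made for illustration. A real volume would have to be read block by block from disk.

```python
import itertools

import numpy as np


def map_segments_to_blocks(segmentation, block_shape):
    """Map each segment id to the index of the block it overlaps most.

    Hypothetical helper: assumes the whole segmentation fits in memory
    and that id 0 is background.
    """
    best = {}  # segment id -> (block index, overlap in voxels)
    starts = [range(0, s, b) for s, b in zip(segmentation.shape, block_shape)]
    for block_index, start in enumerate(itertools.product(*starts)):
        block = tuple(slice(o, o + b) for o, b in zip(start, block_shape))
        ids, counts = np.unique(segmentation[block], return_counts=True)
        for seg, n in zip(ids.tolist(), counts.tolist()):
            if seg == 0:
                continue  # skip background
            # Keep the block with the largest overlap; ties go to the first block.
            if seg not in best or n > best[seg][1]:
                best[seg] = (block_index, n)
    return {seg: blk for seg, (blk, _) in best.items()}
```

Each segment ends up in exactly one block here, which is why a split error spanning two blocks needs the later consensus step.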

While the solutions for 1 and 2 are not quite optimal, they serve our purposes. If anyone is interested, I am happy to expand on this further and share the code. 4 is only a minor issue.

Point 3 is kind of a show stopper right now. It kills productivity: segments that cannot be corrected at this stage cannot be locked, so they are not hidden and get visited over and over again. So we would need to fix this rather soon. To this end, I have two questions: Is this in general a feature you would be interested in having in paintera? If so, would anyone from your side have time to implement it in the short term? If not, I will try to get someone (with more java experience ...) from our side to work on this, in which case we would be very glad to get some pointers :).

constantinpape commented 4 years ago

@wolny and I have now implemented the flaggedSegments functionality by copying the lockedSegments functionality, see https://github.com/constantinpape/paintera/tree/flagged-segments.

This serves our needs for now, but feels a bit hacky because it just duplicates existing functionality. A more general approach would be the following: Support a segment-label-assignment, which can be used to assign segmentIds to labelIds (either from a segment-label-assignment dataset or from user interaction). A label id could have different meanings depending on the use case; for our use case we would just assign segmentIds with a merge error to some specified labelId. In addition, one could add UI support to just display segments with a given labelId, which would be a very useful feature.

hanslovsky commented 4 years ago

Today is my last day at Janelia, so I will cc @axtimwalde and @igorpisarev to include them in the discussion.

constantinpape commented 4 years ago

> Today is my last day at Janelia

Yes, I heard :(. Paintera will miss you ;).

@axtimwalde @igorpisarev If you think something along the lines of the segment-label-assignment idea I sketched out above makes sense, let me know and I will spin this out into a separate issue.

We might consider contributing this at some point, because this fits some of our use-cases, e.g. marking different cell types or marking different stages of proof-reading.

igorpisarev commented 4 years ago

@constantinpape Your current workflow with subdividing the volume into several blocks sounds good and is probably the best way to go currently, as Paintera doesn't support multiple users working simultaneously on the same project yet. The second lock category also makes sense to me, and I'm glad you were able to implement the workaround quickly.

In general, I think it would be nice to be able to assign tags to segments as you suggested. (I prefer the term "tag" since "label" is already used in the code and sometimes in the UI to indicate fragment or segment IDs). There could also be a section in the UI for creating or deleting tags and changing the visibility of segments marked with each tag.
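
To make the tag idea concrete, here is a small sketch of the bookkeeping it implies. It is not Paintera code (Paintera is written in Java) and not the flaggedSegments implementation; the class `SegmentTags` and all of its method names are hypothetical.

```python
from collections import defaultdict


class SegmentTags:
    """Hypothetical registry of tags on segments with per-tag visibility."""

    def __init__(self):
        self._segments_by_tag = defaultdict(set)
        self._hidden_tags = set()

    def tag(self, segment_id, tag):
        self._segments_by_tag[tag].add(segment_id)

    def untag(self, segment_id, tag):
        self._segments_by_tag.get(tag, set()).discard(segment_id)

    def set_tag_visible(self, tag, visible):
        # Hiding a tag hides every segment that carries it.
        if visible:
            self._hidden_tags.discard(tag)
        else:
            self._hidden_tags.add(tag)

    def is_visible(self, segment_id):
        return not any(
            segment_id in self._segments_by_tag.get(tag, set())
            for tag in self._hidden_tags
        )

    def segments_with(self, tag):
        return sorted(self._segments_by_tag.get(tag, set()))
```

With such a structure, "locked" and "flagged" would simply be two tags whose visibility is switched off, and a later pipeline step could query `segments_with("flagged")` to collect the segments that still need attention.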

Can you clarify a bit on (2) in your first post? Is the use-case there to be able to divide a large segment into two smaller segments somehow more efficiently than splitting fragments one by one?

Regarding (4), there is an existing feature request for deleting a fragment which I'm going to look into soon: #347

constantinpape commented 4 years ago

> (I prefer the term "tag" since "label" is already used in the code and sometimes in the UI to indicate fragment or segment IDs). There could also be a section in the UI for creating or deleting tags and changing the visibility of segments marked with each tag.

I agree, tag is a better term than label and I like the ideas for the related UI.

> Can you clarify a bit on (2) in your first post? Is the use-case there to be able to divide a large segment into two smaller segments somehow more efficiently than splitting fragments one by one?

Yes, the idea is to split large segments into two (or more) segments more efficiently. The way we are doing this now is to let the user provide seeds and then run a graph watershed based on these seeds (with edge weights derived from boundary or probability). This makes it possible to perform splits in a matter of seconds, whereas splitting off fragments one after the other can take minutes or even longer and is very error-prone.
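
For readers unfamiliar with the technique, a seeded watershed on a fragment graph can be sketched as below. This is a generic Kruskal-style formulation, not the napari plugin mentioned above. The function name and the input format are assumptions: `edges` holds `(fragment_a, fragment_b, weight)` tuples where a low weight means the two fragments likely belong together, and `seeds` maps fragment ids to user-chosen seed labels.

```python
def seeded_graph_watershed(edges, seeds):
    """Assign every fragment to a seed label (hypothetical sketch).

    Edges are processed from weakest to strongest boundary; two groups
    are joined unless they already carry different seed labels.
    """
    parent = {}
    label = dict(seeds)  # root fragment -> seed label

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for seed_node in seeds:
        find(seed_node)  # register seeds that have no edges

    for u, v, _ in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        lu, lv = label.get(ru), label.get(rv)
        if lu is not None and lv is not None and lu != lv:
            continue  # joining would merge two different seeds: keep the cut
        parent[ru] = rv
        if lv is None and lu is not None:
            label[rv] = lu
    # Fragments not connected to any seed get the label None.
    return {node: label.get(find(node)) for node in list(parent)}
```

Because the work is done on the fragment graph rather than on voxels, the cost depends on the number of fragments and edges in the merged object, which is consistent with splits taking seconds.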

> Regarding (4), there is an existing feature request for deleting a fragment which I'm going to look into soon: #347

Thanks, this is the functionality we would need here as well.