Decide how to handle coordinates in morphologies

mstimberg commented 8 years ago

I already discussed some part of this with @romainbrette, but this needs some fundamental decisions and should be documented here (long post follows, sorry :) ).

Each compartment of a multi-compartmental neuron is currently modelled as a cylinder with a given length, diameter, etc. This together with the structure of the tree specifies everything needed for the simulation. However, for visualisation (and some special cases like the LFP example), we also need the coordinates in 3D space. For abstract models that are created by attaching cylinders to each other, the coordinates are generated automatically (random direction with the given length). The coordinates are also stored with one value per compartment, currently interpreted as specifying the end-point of the compartment (for cylinders, using the (0, 0, 0) origin as a start for the first cylinder; for a spherical soma, the coordinates specify the centre). This is fine as such, but there is a bit of an inconsistency because the membrane potential v is considered to be taken at the middle of the compartment whereas (x, y, z) and the distance attribute are measured at the endpoint.

The question now is how to handle reconstructed morphologies (e.g. from SWC files). These are not specifying cylinders, but instead points along the morphology. A simple SWC file specifies a cylinder with decreasing diameter like this (the format is index type x y z radius parent):

# A cable of with decreasing diameter starting at 100, 100, 100
1 0 100.0 100.0 100.0 10.00 -1
2 0 110.0 100.0 100.0 7.50 1
3 0 120.0 100.0 100.0 5.00 2
4 0 130.0 100.0 100.0 2.50 3

The following pictures show the morphology as loaded by brian2 on the left and as loaded by NEURON on the right: morphology_comparison

There is an important difference in the way that NEURON handles coordinates: they don't define the compartmentalization of what NEURON calls a section whereas in brian2 we don't keep the two kinds of information separate (and since multi-compartmental modelling is not the main focus of Brian, I think we should leave it this way for the sake of simplicity). Now, Brian's cylinder approximation looks good at the first sight, it starts at the right point and the cylinders have the correct length. However, there are a few issues here. The first one is not so difficult to fix: the cylinders have the wrong diameter, the first radius is 7.5 and not something like (10.0 + 7.5)/2 which would be a better approximation to the truncated cone. The second one is a more fundamental one: how does the first cylinder start at the correct location (100, 100, 100) and not at (0, 0, 0) if we only specify the endpoints of compartments? In fact, the morphology in Brian has 4 compartments, not 3, the first compartment is a length 0 compartment at the given location:

>>> morph.length
(0. * metre, 10. * umetre, 10. * umetre, 10. * umetre)
>>> morph.diameter
(20. * umetre, 15. * umetre, 10. * umetre, 5. * umetre)
>>> morph.area
(0. * metre2, 471.23889804 * umetre2, 314.15926536 * umetre2, 157.07963268 * umetre2)

The problem with this is that we cannot simulate this neuron, since the area of the first compartment is 0 and this will lead to NaNs everywhere.

Now, how can we handle this? I can think of the following solutions:

Shift all re-constructed neurons so that they start at (0, 0, 0), get rid of the dummy compartment at the beginning
Always use the dummy compartment approach to anchor the morphology at some point and special case this so that this compartment is not included in the calculation
Use a more precise approach where compartments are not cylinders but truncated cones and e.g. have attributes diameter_start, diameter_end, x_start, x_end, etc. Either store this in some smart way and automatically keep it consistent (diameter_start = diameter_end of the parent compartment) or for simplicity just store everything explicitly at the expense of wasting some memory.

If we go for 1 or 2, is it worth moving the distances and the (x, y, z) coordinates to the middle of the compartment for consistency with other attributes?

Finally, a question is whether we want to enforce the consistency of length/distance with (x, y, z) information. Starting from a reconstructed morphology, the user might for example merge compartments, keeping the total length the same. For non-trivial shapes, we cannot do this while keeping the old coordinates. A similar problem arises when the user starts with a shape like a cylinder (in which all compartments have the same length) and then changes the length of individual compartments. The initial cylinder has already assigned random coordinates, so these would no longer be consistent with the length.

A somewhat simple and clear solution could be to completely separate the process of morphology creation into two:

Morphologies from coordinates, you can later change the number of compartments (which would trigger some automatic recalculation) but nothing else
Morphologies from lengths, etc. Coordinates would not be random but follow some deterministic scheme and would only be calculated on-demand, for plotting or when the morphology is used to create a SpatialNeuron.

Comments and suggestions welcome, it would be great if we come up with something simple that is yet strict and protects the user from making mistakes (in the current code in master, you can assign whatever you want to every attribute and you have to ask to make the other attributes consistent with explicit method calls, etc.).

thesamovar commented 8 years ago

I don't have strong feelings about this, but the one thing I would say is that if we want to do something different to Neuron we should probably have very strong reasons for doing so, as Neuron is the standard for multicompartmental models.

mstimberg commented 8 years ago

At our last meetings we came up with a few possible solutions. I put up a quick summary of the problem and the proposed solutions in the wiki: @romainbrette Could you have a look, if you are fine with everything then I can go ahead and implement them, it shouldn't take too much time.

rcaze commented 8 years ago

I like the dichotomy made by NEURON between the geometry used in simulation and in representation. Merging both is conceptually simpler, but I think that it is the simplest way to solve the kind of problems you presented here (and a solution familier to NEURON users). I am going to send Marcel a zip file (to keep the post short) containing a ".swc" defining a non-trivial morphology and a ".py" defining a function called segmenter that is taking a Brian Morphology as input. It outputs two new set of coordinates: one for what might be simulated and one for what should be represented as a list of arrays (of length nseg). I am confident to use the last set of coordinates for representation, but for the first set It depends choices made in simulation, for now I just turn the Cylinder into a straight line with segments of constant size. The endpoint of the Cylinder is different from the initial one, this might be problematic (or not) when you link the different cylinders. I have no strong feeling on that. Also I did not address issue of the changing diameter. Note that I did not change the Morphology object.

rcaze commented 8 years ago

On how to set the simulated segment I would go for this option "Let's not care about the consistency (2), allow morphologies to have inconsistent length and x, y, z values." It seems that the start and endpoints are crucial. So one solution would be to set the new (simulated) coordinates to [(st_coor), ..., (mid_coor) ...., (end_coor)] where mid_coors are defined by the distance between these two points divided by number of segments. In this case the new set of coors would no be consistent with the length.

mstimberg commented 8 years ago

Right, so just to make sure I understood correctly: if a morphology defines (in 2D for simplicity) four points as (0, 0), (0, 1), (1, 1), (1, 0) (i.e. a total length of 3) and you want to translate it from three to two compartments, you'd simply use (0, 0), (0.5, 0), (1, 0) as the coordinates but have a length of 1.5 per compartment (assuming same diameters/resistances), right?

rcaze commented 8 years ago

I do not do that in the script I sent you (different end point) but after reading the wiki and thinking about it, I think this is the most elegant solution. Note that in the example you used the represented things would have 5 points and the output of the function would be [array([[0, 0, 0.5], [0, 1, 1]]), array([[0.5, 1, 1],[1, 1, 0]])]. You might want to use this example in a test.

mstimberg commented 8 years ago

Note that in the example you used the represented things would have 5 points and the output of the function would be [array([[0, 0, 0.5], [0, 1, 1]]), array([[0.5, 1, 1],[1, 1, 0]])].

I don't quite get it, why would I have 5 points and what is the "represented things"?

rcaze commented 8 years ago

My goal there is to do a 2-D show shape (like the plot method from the Morpho object) and animate it (coloring it given their voltage). A video of what I would like to do is available on bioarxiv (http://biorxiv.org/content/early/2015/07/24/023200.figures-only) (I did in NEURON, now I want to do it in Brian). The represented things are the points I use in the animation and the simulated thing is the what I use to color the segments.

mstimberg commented 8 years ago

Ok I more or less get what you want to do, but why did you say "Note that in the example you used the represented things would have 5 points " when my example had 4 points? And why would "output of the function" would have z coordinates when I only had a 2d morphology?

rcaze commented 8 years ago

Your example has originally four points. But if you put it as argument with nseg=2 in the "segmenter" function I wrote then It would output five distinct points (three and three). the first element of the list is the x and y of points in the first segment and the second element is the x and y of point in the second segment (look at the test in the script I am about to send).

mstimberg commented 8 years ago

I had a look at the script you sent me, but I am afraid I still don't quite understand what it does. Here's what I thought we agreed we should do:

For each section (in NEURON-speak), keep the coordinates of the start and end point, all other points are equally spaced (or maybe not equally spaced if the initial lengths are not equal) according to the new number of segments
The lengths of the new segments are so that the total length of the section is unchanged -- this means that most likely they are inconsistent from the coordinates calculated as described above.

A concrete example: Assuming this morphology for one section

(0, 0, 0)
(1, 0, 0)
(1, 1, 0)
(0, 1, 0)

So this is an "inverted C shape" with 3 segments (lengths=[1, 1, 1], i.e. a total length of 3). If we want to convert this into 2 segments then I thought we would simply use:

(0, 0, 0)
(0, 0.5, 0)
(0, 1, 0)

with lengths=[1.5, 1.5], i.e. the length information is now inconsistent with the coordinates (which would give lengths=[0.5, 0.5]).

Now your code gives me two sets of coordinates, sec_new_coor and sec_plot_coor, the first one being:

(0, 1.5, 3)
(0, 0, 0)
(0, 0, 0)

And the second one is list of 2x3 coordinates:

(0, 0, 0)
(1, 0, 0)
(1, 0.5, 0)

(1, 0.5, 0)
(1, 1, 0)
(0, 1, 0)

It also returns a single value for the length of 1.5. I don't understand the first sec_new_coor at all and the sec_plot_coor are kind of nice because it stays closer to the original morphology, but what are we supposed to do with these intermediate points? With two segments, shouldn't the list of coordinates according to this algorithm (mapping of distances to points, interpolate for the new distances) just consist of three points? Like this:

(0, 0, 0)
(1, 0.5, 0)
(0, 1, 0)

This was what I had implemented so far -- then I stopped when I realized that this cannot be made consistent with the lengths (we now all agreed on it is fine to not care about this).

I'd be grateful if you could shed some light on this...

rcaze commented 8 years ago

Dear Marcel,

I am going to start by sec_plot_coor. This output is a list of two, because we want two segments. The first segment defines three points and each tuple defines a point '(x, y, z)' the first is (0, 0, 0) the second is '(1, 0, 0)' and the third is '(1, 0.5, 0)' (first half of the U). The second element in the list define the three points used to define the second segment. This sec_plot_coor variable will be super useful to draw and color the neuron according to voltage for instance.

And for 'sec_new_coor' you are right and the output should be what you say [(0, 0, 0), (1, 0.5, 0), (0, 1, 0)]. We should update the code accordingly and maybe wrote a test for that. The output currently given by my script is wrong and the one you give is right.

Notably, It should be decided something about the diameter because it might be changing in the swc file.

mstimberg commented 8 years ago

I am going to start by sec_plot_coor. This output is a list of two, because we want two segments. The first segment defines three points and each tuple defines a point '(x, y, z)' the first is (0, 0, 0) the second is '(1, 0, 0)' and the third is '(1, 0.5, 0)' (first half of the U). The second element in the list define the three points used to define the second segment. This sec_plot_coor variable will be super useful to draw and color the neuron according to voltage for instance.

That's what I still don't understand: why do we need three values per segment? By definition, there will be a single voltage value for each segment, so why do you need three points to plot it?

And for 'sec_new_coor' you are right and the output should be what you say [(0, 0, 0), (1, 0.5, 0), (0, 1, 0)]. We should update the code accordingly and maybe wrote a test for that. The output currently given by my script is wrong and the one you give is right.

Ah, ok, that explains it :)

Notably, It should be decided something about the diameter because it might be changing in the swc file.

Yes, I think the reasonable choice would be to choose the new diameters so that the total area of the section would stay the same.

rcaze commented 8 years ago

That's what I still don't understand: why do we need three values per segment? By definition, there will be a single voltage value for each segment, so why do you need three points to plot it?

In the following function (something in this taste), I need two times three points to draw the two segments (and color these lines afterwards using the two v values).

import brian2 as br2

def add_line(ax, lines, sec):
    """Add a line to a 2-D shape plot (z=0)"""
    diams = br2.asarray(sec.diameter)
    xs = sec.sec_plot_coor_x
    ys = sec.sec_plot_coor_y
    for i in range(len(xs[1:])):
        diam = diams[i]*1e6
        lines[sec.name].append(ax.plot(xs[i:i+2], ys[i:i+2], '-k', color='black', linewidth=diam))
    return lines

P.S: I am learning markdown and it is super cool!

mstimberg commented 8 years ago

Sorry, I have a hard time interpreting your code... Previously, the sec_plot_coor was a list with two elements, each element was a 2d array. Now you have sec_plot_coor_x and this would be what? A flat list of 6 (3 for each segment) points? Could you say what sec_plot_coor_x and y correspond to in the above example?

rcaze commented 8 years ago

Sorry, this should be interpreted more a pseudo code and wishful thinking. I can send you a working example. Working on it!

rcaze commented 8 years ago

This should be (clearer). Remove useless bits and this should almost be working

def add_line(ax, lines, sec):
    """Add a line to a 2-D shape plot (z=0)"""
    # Use segmenter to generate list of lists for x and y
    sec_plot_coor, bla, foo = segmenter(sec)
    xs, ys = [], []
    for seg_coor in sec_plot_coor:    
        xs.append([i[0] for i in seg_coor])
        ys.append([i[1] for i in seg_coor])

    # Trace the different segments
    for i in range(sec.nseg):
        lines[sec.name].append(ax.plot(xs[i], ys[i]))

     # Keep the lines for futur coloring
    return lines

mstimberg commented 8 years ago

This is not quite working either, but I get the idea. Still not sure what your approach is: do you want to to be able to plot something that resembles the original morphology instead of the reduced one? I.e. it is not as if every segment would always have 3 points in its "plot coordinates" but instead this would depend on the number of points in the original morphology? So if you'd have a morphology with 11 points (--> 10 segments) and you reduce it to 2 segments, then each of the segments would store 6 points? I can see the potential usefulness of this (maybe I'd rather call it "original coordinates" or something like that), even though I personally would always prefer to plot the morphology that was actually simulated instead of mapping the coarse values onto a more detailed morphology (otherwise you can't tell the difference between "the voltage in this section is perfectly homogeneous" and "the section was simulated as a single segment"). But I guess this is NEURON's approach for plotting, i.e. it always uses all the Point3D information for a section even if that section has fewer segments?

rcaze commented 8 years ago

Yes, this code was intended to be clearer mainly. This pseudo-function aims to work for an arbitrary number of points per segments and an arbitrary number of segments. Also the number of points can be different from segment to segment. This is why I am using List and not Array as in Array you need to have a square thing. I guess this could go nicely in the refactoring morphology fork.

The NEURON approach is a bit of a cheat true, but it does beautiful animations ;)

rcaze commented 8 years ago

In fact there is no need for a dichotomy in Brian. I now understand better our problems and I think I know how we can use this segmenter function. It would take as input one Morphology object and output a N morphologies defined by N sets of points, these morpholgies would have the same length that is the initial size divided by N. The bit that I do not know is how to preserve the connectivity so that the first and last segments are connected to the parent and to the children and the segments within are connected together.

One could include the segmenter function within the code of a Morphology object. As a setter, so to be able to update the number of segments.

rcaze commented 8 years ago

There is already something called n as an optional argument in a Morphology object. The objective would be to use the segmenter to extend that to all the subtree.

mstimberg commented 8 years ago

Fixed via #652 -- discussion about the segmentation can take place in #636

brian-team / brian2

Decide how to handle coordinates in morphologies #569