Todd Pataky told me to have a look at ITK. Here is some information I could find from presentations and other material:
http://www.itk.org/Wiki/ITK/Examples
http://www.itk.org/CourseWare/Training/RegistrationMethodsOverview.pdf
Basically the workflow goes something like this:
I reckon we'd already get pretty far by using the most basic elements. Here are some of the functions I noticed in the presentation above:
- `NormalizeImageFilter`?
- `itk::Euler2DTransform` – represents a rotation and translation in 2D; the transformation/rotation is about the origin {0, 0}
- `itk::InterpolateImageFunction`:
  - `itk::NearestNeighborInterpolateFunction` – assumes the image is piecewise constant
  - `itk::LinearInterpolateFunction` – assumes the image is piecewise linear
  - or a B-spline interpolator
- `itk::ImageToImageMetric` – mean square difference, correlation or mutual information
The main difficulty is probably in getting to know the ropes of the Python version of ITK (SimpleITK?) and converting numpy arrays back and forth.
First off, I would probably interpolate the 3D paw data array, because at least in RSscan's case the sensors aren't square, nor are they particularly small. It's not really feasible to divide it down to sub-millimeter pieces, because this would make it computationally much more expensive, and the error created by some small loss of information is probably smaller than the influence of a smoothing kernel.
I think I should use an affine transformation, because there can be rotation, translation and scaling (differences between hind/front paws or the size of the animal).
As for the optimization, a simple mean square difference would probably suffice to get started.
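As a rough sketch of that interpolation step (the sensor dimensions below are placeholders, not RSscan's actual specifications), something like scipy.ndimage.zoom could bring the contacts to a roughly square grid:

```python
from scipy import ndimage

def upsample_contact(contact, sensor_size_mm=(7.6, 5.1), target_mm=1.0):
    """Resample a (rows, cols, frames) pressure array to ~square pixels.

    The sensor dimensions are hypothetical placeholders; use the real
    plate specifications. Linear interpolation (order=1) keeps the
    computational cost and smoothing modest.
    """
    zoom_y = sensor_size_mm[0] / target_mm
    zoom_x = sensor_size_mm[1] / target_mm
    # Only resample the two spatial axes, not the time axis.
    return ndimage.zoom(contact, (zoom_y, zoom_x, 1), order=1)
```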
I had found a piece of scipy code that performs image registration, so I thought I'd do a little experiment. I took the first contact of a measurement and made it my template (padded the whole thing with zeros so it can easily be rotated and translated, and so everything has equal dimensions). Then I took the next contact of the same paw and rotated it by 15 degrees, to see whether image registration with the template would put it back at its normal angle:
First thing that comes to mind are the artifacts from using scipy.ndimage.interpolation.rotate, which might create significant differences with the original. That said, the registration seems to work pretty well, with the contact being rotated back by -18 degrees.
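As a hedged stand-in for that registration code (the original snippet presumably used a proper optimizer, but the idea is the same), a brute-force search over rotation angles minimizing the mean squared difference illustrates the experiment. `template` and `paw` are assumed to be equally sized, zero-padded 2D arrays:

```python
import numpy as np
from scipy import ndimage

def estimate_rotation(template, image, angles=range(-90, 91)):
    """Brute-force stand-in for the registration code: try a range of
    angles and keep the one with the lowest mean squared error."""
    best_angle, best_mse = 0, np.inf
    for angle in angles:
        rotated = ndimage.rotate(image, angle, reshape=False, order=1)
        mse = np.mean((template - rotated) ** 2)
        if mse < best_mse:
            best_angle, best_mse = angle, mse
    return best_angle

# Rotate the second contact by 15 degrees; registration against the
# template should then report roughly -15 degrees.
paw_rot = ndimage.rotate(paw, 15, reshape=False, order=1)
print(estimate_rotation(template, paw_rot))
```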
However, when applying it to all the other contacts in the same measurement:
In several cases it works OK-ish, but in the third and last rows it completely messes up. The third row translates the image off screen for no apparent reason. The last row rotates the paw by 90 degrees, possibly because it's a mirror image (right vs. left paw), causing it to overestimate the rotation.
I added a check on the estimated angle: if its absolute value wasn't less than 45 degrees, I mirrored the image around the vertical axis. As you can see, we no longer get very awkward results, though rotating externally rotated left feet onto externally rotated right feet is still not 'perfect'.
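Reusing the stand-in above, that mirror check could look something like this (the 45-degree cutoff is the one from the text):

```python
# If the estimated rotation is suspiciously large, assume we're comparing
# a left paw with a right paw: mirror around the vertical axis and retry.
angle = estimate_rotation(template, contact)
if abs(angle) >= 45:
    contact = np.fliplr(contact)
    angle = estimate_rotation(template, contact)
```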
When trying to do the same using my template instead of my rotated paw, the results are much more boring. Only the third row is rotated, by 9 degrees; the rest are only translated 1 or 2 pixels. Judging from the shape, you'd hope it would rotate them at least a little; this might be caused by the lack of interpolation.
Interpolating seems to make things worse: the registration basically resorts to (extreme) translations. Mirroring the paws doesn't seem to do it much good either.
So here I was trying to get SimpleITK working:
```python
import SimpleITK as sitk
import matplotlib.pyplot as plt

# Convert the numpy arrays to SimpleITK images.
fixed_image = sitk.GetImageFromArray(template)  # inverse is GetArrayFromImage()
moving_image = sitk.GetImageFromArray(paw_rot)

# Resampler with B-spline interpolation, set up for later use.
resample_filter = sitk.ResampleImageFilter()
resample_filter.SetInterpolator(sitk.sitkBSpline)

# Normalize both images to zero mean and unit variance.
normalizer = sitk.NormalizeImageFilter()
fixed_normalized = normalizer.Execute(fixed_image)
moving_normalized = normalizer.Execute(moving_image)

# Smooth both images with a small Gaussian kernel.
gaussian_filter = sitk.DiscreteGaussianImageFilter()
gaussian_filter.SetVariance(0.5)
fixed_smoothed = gaussian_filter.Execute(fixed_normalized)
moving_smoothed = gaussian_filter.Execute(moving_normalized)

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(sitk.GetArrayFromImage(fixed_smoothed))
ax2.imshow(sitk.GetArrayFromImage(moving_smoothed))
```
The next step according to this Registration example from itk.org is to set up a transform. Guess what:
SimpleITK does NOT support registration, I'll be damned! No wonder I couldn't find any of the relevant functions. Guess I'll have to figure out how to translate the above to ITK's Python wrapper...
So I tried installing ITK instead; supposedly Python(x,y) comes bundled with it. However, the installer I got was for 32-bit Python and I had switched everything to 64-bit... So I tried compiling ITK from source with the wrappers, but it crashed Visual Studio, so no pie there either.
So I switched back to 32-bit Python, only to find out that the installer supplied by Python(x,y) gives a DLL error on 32-bit Windows too... Just my day, I guess.
I've also found a TU Delft project that made a library for visualizing medical data, which bundles ITK/VTK and some other tools in a portable format. I tried to use their 64-bit version, but ran into issues. Their 32-bit version seems to work, but I haven't tried porting it to my Anaconda Python version. If I could get that working, I could probably do the same for the 64-bit version as well.
Furthermore, I'm trying to install Python(x,y) to see whether ITK gives problems there too, which might indicate there's a problem with the installer I found or that it has additional dependencies.
I think I finally got it working. It was a bit of a hassle to figure out the right settings and an even bigger pain to get ITK working in the first place, but the result seems worth it:
Now this is just a first example that shows it works. Next up, I need to figure out what a good (hopefully universal) template would be, or whether I need templates for each different paw. And I need to package this up so I can apply it to all my paws.
One issue with ITK is that it's tricky to install. Eventually I ended up using the TU Delft version, because the Python(x,y) version lacked PyBuffer and I had no idea how to add it manually. But this version requires the user to muck with their Python/system path. They 'solved' this by supplying a .bat file which I believe adds these for you, which may be an option. They supply their whole application as a big zip file that contains all their dependencies, even bundling Python.
Until I come up with an acceptable solution, I might use the naive Python version I used above and only switch to ITK if it's available on the system.
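That fallback could be as simple as a guarded import (a sketch; the actual module name depends on which wrapper ends up installed):

```python
# Prefer ITK when it's importable, otherwise fall back to the naive
# scipy/numpy registration used above.
try:
    import itk  # noqa: F401
    HAS_ITK = True
except ImportError:
    HAS_ITK = False
```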
Given what a pain ITK is to install, I've tried imreg again (https://github.com/pyimreg/imreg), which is a small package written by Stefan van der Walt (amongst others). I first had issues running it under 64-bit Python, because for some reason it wouldn't compile the interpolation Cython code they included. On 32-bit Python it ran without problems, so I'll have to see whether it poses a problem for others.
I used it to do a little experimenting. I took all contacts for Krulle and created an average contact:
Then I used image registration to rotate and normalize all contacts and used those to create a new average:
I then tried some naive clustering. First I calculated the mean squared error between all pairs of contacts (using scikit-learn's mean_squared_error):

```python
from collections import defaultdict
from sklearn.metrics import mean_squared_error

# Pairwise MSE between all registered, padded contacts. The double loop
# over all contacts fills in both (i, j) and (j, i).
diffs = defaultdict(lambda: defaultdict(float))
for index1, contact1 in enumerate(new_padded):
    for index2, contact2 in enumerate(new_padded):
        diffs[index1][index2] = mean_squared_error(contact1, contact2)
```
Then I applied clustering:

```python
import heapq

h = []
leaders = {}
explored = set()
clusters = {}

# Start with every contact in its own cluster and push all pairwise
# differences onto a heap, so the best matches are considered first.
for index1 in sorted(diffs):
    clusters[index1] = set([index1])
    leaders[index1] = index1
    for index2 in sorted(diffs[index1]):
        if index1 != index2:
            diff = diffs[index1][index2]
            heapq.heappush(h, (diff, (index1, index2)))

# Greedily merge clusters, smallest difference first, as long as the
# MSE stays below the threshold. Each contact is merged at most once.
while h:
    score, (index1, index2) = heapq.heappop(h)
    leader1 = leaders[index1]
    leader2 = leaders[index2]
    if leader1 != leader2 and index1 not in explored and score < 0.08:
        explored.add(index1)
        for node in clusters[leader2]:
            clusters[leader1].add(node)
            leaders[node] = leader1
            if node in clusters:
                del clusters[node]
```
This clusters contacts together as long as the MSE between an element and a member of the cluster is < 0.08 (quite strict!). It also only adds an element to a cluster once, as its best match (thanks to the heap).
This gives a large number of clusters with only 1 contact, so we make a second pass:
```python
# Second pass: let each small cluster (3 contacts or fewer) join the
# cluster of its best match, unless even that match is really poor.
for index, matches in list(clusters.items()):
    if len(matches) <= 3:
        best = 100
        best_index = None
        for index2, m in diffs[index].items():
            if m < best and index != index2 and index2 not in matches:
                best = m
                best_index = index2
        # Skip if no candidate was found or the best match is erroneous.
        if best_index is None or diffs[index][best_index] > 0.6:
            continue
        leader1 = leaders[best_index]
        for node in matches:
            clusters[leader1].add(node)
            if node in clusters:
                del clusters[node]
            leaders[node] = leaders[best_index]
```
Here we allow each small cluster to find its best match (unless even that match is really erroneous). The resulting clusters look something like this (at the top I've added the average difference +/- standard deviation):
As you can see, some of the larger clusters also have a higher average difference and standard deviation, because small variations in the pressure distribution can lead to differences in the error. Especially errors near our threshold can cause contacts to be clustered together that only overlap in the central toes, while the lateral toes are wildly different.
However, the same applies to some smaller clusters, where you can spot 'mistakes' even though the mean error is relatively small (see the last example).
One way to improve the clustering would be to remove invalid/incomplete contacts, which can lead a cluster in the wrong direction. By tweaking the initial error threshold we can force clusters to be more homogeneous; while the second iteration can still add less optimal matches, it can no longer make the whole cluster stray in the wrong direction.
Trial and error showed that increasing the threshold in the first iteration leads to one cluster dominating all others, because it contains the best possible matches (by design) and doesn't allow new clusters to form.
Tweaking the first threshold seems to yield more homogeneous clusters, which also means we get more of them. Getting it to end up with exactly four clusters will probably prove difficult.
Another improvement could be a multi-modal approach, where we also include information like contact duration, maximum/average force and contact surface. If you look at the force over time for the contacts in the first cluster, you see there's a large variety between them, which raises doubts about the correctness of the cluster and suggests that using information from these time series would also be interesting.
Note that footscan sensors are not square but rectangular, so I interpolated all contacts to a ~1x1 mm grid. Now if you rotate a contact, its length is no longer underestimated (previously, narrow sideways pixels would be rotated upwards).
For the image registration I experimented with creating an average of all contacts as my template:
Then I used the template to rotate all my contacts with pyimreg, trying both its homography and affine models:
As you can see, the toes are much better aligned, because they no longer create a smear at the top. I also experimented with using this new average as a template and rotating the contacts again, but the performance gains seemed minimal.
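Wiring up that iteration might look like this, with `register_to` standing in for the actual pyimreg registration call (a hypothetical wrapper, since the exact API isn't shown here):

```python
import numpy as np

def refine_template(contacts, register_to, iterations=2):
    """Sketch: register all contacts to the running average, rebuild
    the average and repeat. register_to(template, contact) is a
    hypothetical wrapper around the pyimreg registration call."""
    template = np.mean(contacts, axis=0)
    for _ in range(iterations):
        registered = [register_to(template, c) for c in contacts]
        template = np.mean(registered, axis=0)
    return template
```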
I also changed my normalization function so the mean squared error (MSE) would be more realistic, but subtracting the mean and dividing by the standard deviation removes the absolute differences in pressure between front and hind paws. So I changed my distance function to be the average of the MSE, the difference in maximum pressure and the difference in pixel count (basically contact surface). The latter two were calculated by subtracting the values for the two contacts and dividing by the value of the contact we were comparing with. I also tried including contact duration, but this feature isn't unique to a particular paw, only to similar paws, and thus had a negative impact.
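A hedged reconstruction of that distance function (whether absolute values were taken, and which contact serves as the denominator, are assumptions):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

def contact_distance(contact1, contact2):
    """Average of the MSE and the relative differences in maximum
    pressure and pixel count (contact surface)."""
    mse = mean_squared_error(contact1, contact2)
    # Relative differences, normalized by the contact we compare against.
    max_diff = abs(contact1.max() - contact2.max()) / contact2.max()
    pixel_diff = (abs(np.count_nonzero(contact1) - np.count_nonzero(contact2))
                  / np.count_nonzero(contact2))
    return (mse + max_diff + pixel_diff) / 3.0
```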
I also messed around with different clustering techniques and I'm still not entirely satisfied. The naive approach I was using before had the downside that it would sometimes make sub-optimal merges in the first step, causing the cluster to stray off towards other contacts.
It also seems to have trouble distinguishing between left and right hind paws, because the sensors that should make the difference (under the lateral toes) don't contribute much to the MSE: whatever error they create is insignificant. I tried to mitigate this by creating a second MSE in which I only compare values below a certain threshold, which should make it more sensitive to those regions, but it doesn't seem to help much.
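That thresholded MSE looks roughly like this (the threshold value is a placeholder):

```python
import numpy as np

def low_pressure_mse(contact1, contact2, threshold=0.1):
    """MSE restricted to pixels where both contacts stay below a
    threshold, to give the faint lateral toes more weight."""
    mask = (contact1 < threshold) & (contact2 < threshold)
    return np.mean((contact1[mask] - contact2[mask]) ** 2)
```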
While I'm not really sure yet how to use these calculations in the application, I do think I'm onto something useful. When you create multiple clusters with a really small difference, you can really see the resemblance. Coupled with human supervision, this could drastically cut down the number of contacts you'd have to label, because the labels for all contacts that closely resemble a labeled one can be inferred.
I also tried using PCA (like Joe used) and then applying some KMeans to it (or any other clustering, really). The results were strikingly similar to my own approach.
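For reference, that PCA + KMeans experiment is only a few lines with scikit-learn (the number of components and clusters below are guesses, not tuned values):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Flatten every registered, padded contact into one feature vector.
X = np.array([contact.ravel() for contact in new_padded])

# Reduce dimensionality, then cluster the projected contacts.
reduced = PCA(n_components=10).fit_transform(X)
labels = KMeans(n_clusters=4).fit_predict(reduced)
```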
The only disappointment is that it works very poorly for really small dogs. When a contact covers only 4-10 sensors in total, there aren't enough pixels to make meaningful comparisons. Furthermore, image registration (using pyimreg) gets complicated and in some cases simply raises an error. Perhaps PCA will deal with this better, so we'll have to wait and see.
When I want to compare contacts, there are differences in size and rotation. By using image registration I can normalize a contact, such that it overlaps optimally with a given template contact. The result allows for more accurate comparisons between contacts, which should result in a better heuristic for labeling and better statistics.
See the work by Pataky, Keijsers or Oliveira on this subject in human pressure measurements.
Joe Kington supplied me with a naive implementation:
Basically it reshapes the image to a 20x20 pixel grid and mean-normalizes the result. While this is a step in the right direction, it doesn't rotate the paws, and it gives hind paws the same size as front paws; that size difference is useful information for labeling, which gets lost here, though it could be added back later in a labeling algorithm.
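A minimal sketch of that approach (assuming a 2D pressure image, and assuming the mean normalization also divides by the standard deviation, as done earlier in this thread):

```python
from scipy import ndimage

def standardize_paw(contact, size=(20, 20)):
    """Resample the contact to a fixed 20x20 grid and normalize it to
    zero mean (and, by assumption, unit variance), discarding size."""
    zoom = (size[0] / contact.shape[0], size[1] / contact.shape[1])
    resized = ndimage.zoom(contact, zoom, order=1)
    return (resized - resized.mean()) / resized.std()
```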