zooniverse / aggregation


Aggregation doesn't give usable/useful results when marked entities overlap #144

Open alexbfree opened 8 years ago

alexbfree commented 8 years ago

For example, here are a couple of cases from the recent beta of Snapshot Serengeti: Computer Vision.

Subject 1925326 (formerly ASG0018whg in Serengeti)

The crowd opinions (image: asg0018whg) collectively comprise 55 annotations, but this aggregates to a result (image: asg0018whg) that still has 17 clusters!

Compare this to a better example, subject 1714356 (formerly ASG001c9b4): the crowd opinions (image: asg001c9b4, 57 clusters) aggregate to a result (image: asg001c9b4) with 2 clusters, exactly corresponding to the number of animals in the picture. Perfect! Very usable and exactly what we want from the output.

In the case of this project, the science team wants to use the aggregate answer to determine which images to crop out and use for training the animal-detection computer vision system. Answers like the first one can't be used (and also can't easily be pulled out and separated from the "good" answers, even if we did find some way to handle them manually).

I realise that overlapping animals/regions is a hard problem, but right now the code seems to give up, and it doesn't even mark that the image has been handled differently.

Also, there are some cases where the aggregation seems to do a poor job and lose important data. Here is another example:

Subject 1925354 (formerly ASG001e0wq). The crowd opinions look like this (image: asg001e0wq): it's quite clear that almost everyone in the crowd agrees on the approximate locations of the three zebras and their three different directions. In light of this, the result from the aggregation is not at all useful, and it is missing important detail on the presence of three animals and their positions/directions. Note that the left and right animals are lost, and only a middle animal is shown, wrongly (image: asg001e0wq).

In summary, I think overall we need to

  1. Improve aggregation answers for overlapping entities
  2. Where a good answer is not easy/possible, add some tagging or metadata to that image's results to show that it is suboptimal/needs further attention.

The second is most important of all, I think.

BTW, for a sense of the spread of this problem: 23 of the 66 images from the beta are subject to these issues, i.e. about a third. We definitely need to address this if aggregation is to be useful.

ggdhines-zz commented 8 years ago

The rectangle marking tool wasn't designed to handle this sort of case. It's a perfectly reasonable case to want to handle, and I don't think it should be too hard. It's semantics, but I would count this as an enhancement and not a bug. (I can think of projects where the above behaviour would be wanted, for example using rectangles to outline grass.) So I'll need to figure out a way to report both types of results.

alexbfree commented 8 years ago

OK, I take your point that dealing with situations like this could be deemed an enhancement, but can we at least tag the outputs to identify which images are successfully combined and which aren't? I would say that without that, the aggregation is not usable for a project like this.

ggdhines-zz commented 8 years ago

There is always a chance of error, not just for rectangle aggregation but for all aggregation algorithms, and for citizen science in general. So if you need 100% accuracy, there's not much I can do for you. The closest example I can think of is Penguin Watch: the clustering there is pretty accurate but sometimes makes mistakes, and the only way to know for sure is to examine some of the images.

alexbfree commented 8 years ago

I'm not saying we need to eliminate error. I'm saying we need to differentiate, through some metadata tag or extra field, the "ones we can't do" from the "ones we can". It seems the algorithm is succeeding in some cases and failing in others, so if the algorithm knows the difference, we should export that fact in the CSVs.
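For illustration, a minimal sketch of what such a flag could look like in the export; the column names and values below are hypothetical, not the actual CSV schema:

```python
# Hypothetical sketch: a per-subject quality flag in the CSV export.
# Column names and values are illustrative; the real export format differs.
import csv

rows = [
    {"subject_id": 1714356, "num_clusters": 2, "aggregation_ok": True},
    {"subject_id": 1925326, "num_clusters": 17, "aggregation_ok": False},
]

with open("aggregation_export.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["subject_id", "num_clusters", "aggregation_ok"])
    writer.writeheader()
    writer.writerows(rows)
```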

alexbfree commented 8 years ago

Discussed with Greg and he pointed out it is difficult to deduce this algorithmically.

One idea (either for users of the export data to discern good from bad, or as a heuristic/recommendation the export could provide): one thing we do have in the aggregate output is the average number of clusters marked per user, so it should be possible to filter subjects based on that. If the number of rectangles post-aggregation is vastly more than the average number of clusters marked per user, then we know further attention is definitely needed; if it's the same, or within +/- 2, then the results are more likely to be useful.
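A minimal sketch of that filter, assuming you have the post-aggregation cluster count and each user's marking count to hand (the function name and the per-user counts below are made up for illustration):

```python
# Hypothetical sketch of the filtering heuristic described above; the
# field names and per-user counts are illustrative, not real output data.

def needs_review(num_clusters_after_aggregation, markings_per_user, tolerance=2):
    """Flag a subject when the post-aggregation cluster count differs from
    the average number of rectangles each user drew by more than `tolerance`."""
    avg = sum(markings_per_user) / len(markings_per_user)
    return abs(num_clusters_after_aggregation - avg) > tolerance

# e.g. subject 1925326 aggregated to 17 clusters although users drew only a
# few rectangles each, while subject 1714356 aggregated to 2
print(needs_review(17, [3, 2, 3, 4, 3]))  # True  -> needs further attention
print(needs_review(2,  [2, 2, 3, 2, 2]))  # False -> probably usable
```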

chrislintott commented 8 years ago

It's not that hard: restricting clusters to at most one marking per user should do it. See the Andromeda Project data reductions for an example.
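A rough sketch of what that restriction could look like as a post-processing step, assuming clusters are lists of (user_id, marking) pairs; this data model is an assumption, not the aggregation code's actual structures:

```python
# Rough sketch: enforce at most one marking per user per cluster.
# The (user_id, marking) cluster representation is an assumption.
from collections import defaultdict

def split_cluster(cluster):
    """If any user appears more than once in a cluster, they must have
    marked distinct (overlapping) entities, so redistribute the markings
    into sub-clusters containing at most one marking per user each."""
    by_user = defaultdict(list)
    for user_id, marking in cluster:
        by_user[user_id].append(marking)

    num_entities = max(len(markings) for markings in by_user.values())
    sub_clusters = [[] for _ in range(num_entities)]
    for user_id, markings in by_user.items():
        # Naive assignment: the i-th marking from each user goes to the
        # i-th sub-cluster; a real implementation would match spatially.
        for i, marking in enumerate(markings):
            sub_clusters[i].append((user_id, marking))
    return sub_clusters

print(split_cluster([("a", "rect1"), ("a", "rect2"), ("b", "rect3")]))
# -> [[('a', 'rect1'), ('b', 'rect3')], [('a', 'rect2')]]
```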
