Ignore clusters which is set of random faces

stalker314314 commented 5 years ago

Sometimes, one of bigger clusters tends to be just a bunch of random faces put together. This seems to be limitation of chinese whispers algorithm. In my case, it is the biggest cluster (in Matias it is 3rd). Ideally, we should be able to somehow detect those clusters and either:

ignore them when we create clusters, or
just not show them

Here is one example what they look like: slika

--- Want to back this issue? **[Post a bounty on it!](https://www.bountysource.com/issues/69554823-ignore-clusters-which-is-set-of-random-faces?utm_campaign=plugin&utm_content=tracker%2F74944432&utm_medium=issues&utm_source=github)** We accept bounties via [Bountysource](https://www.bountysource.com/?utm_campaign=plugin&utm_content=tracker%2F74944432&utm_medium=issues&utm_source=github).

stalker314314 commented 5 years ago

I tried playing with this and my initial though was to try to get stdev of all 128D of faces in cluster and compare those. This lead nowhere, seems that distances are pretty much equal among "bad" and "good" clusters.

Other metric that seems to work is to measure interconnectedness. Count all connected faces inside cluster (for each pair of faces, count if euclidean distance < 0.5) and divide that by N*(N-1)/2 (N is size of cluster, total possible connections). In ideal case, each face should be connected with each other face and we would get 100%. Results I am getting is that "good" clusters tend to be between 20%-100% and my "bad" cluster is at 5%.

Here is my test, for anyone to replicate:

conn = psycopg2.connect(host="127.0.0.1", database="ncdev", user="ncdev", password="")
cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)
cur.execute("select person, count(*) from oc_face_recognition_faces group by person order by 2 desc")
person_groups = cur.fetchall()
cur.close()
for i in range(len(person_groups)):
    person_id = person_groups[i]['person']
    cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)
    cur.execute("select * from oc_face_recognition_faces where person=" + str(person_id))
    faces = cur.fetchall()
    descriptors = []
    for i in range(len(faces)):
        descriptors.append(faces[i]['descriptor'])
    faces_numpy = np.asarray(descriptors)
    m = np.mean(faces_numpy, axis=0)
    std = np.std(faces_numpy, axis=0)
    cur.close()

    connected = 0
    for i in range(len(faces)):
        for j in range(i+1, len(faces)):
            if np.linalg.norm(faces_numpy[i] - faces_numpy[j]) < 0.5:
                connected += 1
    interconnectness = 200 * connected / (len(faces) * (len(faces)-1)) if len(faces) > 1 else 100
    print(person_id, len(faces), np.linalg.norm(std), connected, interconnectness)

It would be great if anyone also test these assumptions and report values they are getting (tagging @matiasdelellis , @SlavikCA )

matiasdelellis commented 5 years ago

Wow, It looks interesting!. :smile:

When I recover access to my home server (I'm moving of the apartment and turn it off), I'll play a little with this.. :wink:

matiasdelellis commented 5 years ago

D'Oh!. My servers use mariadb/mysql.. :disappointed:

stalker314314 commented 5 years ago

should be trivial to convert to mysql. There is nothing smart in SQL part, just retrieving 1) list of all persons, 2) getting all face descriptors for each person. I could rewrite it for mysql, if that will help, LMK if you would prefer that!

matiasdelellis commented 5 years ago

should be trivial to convert to mysql.

Of course .. When I can I see it .. :smile:

stalker314314 commented 5 years ago

BTW, one missing piece was to play with 128D vectors and chinese whispers in Python, as pyDlib only could create chinese whispers if you give it face descriptors (which we don't, we have plain old arrays), so I added support for that in python Dlib: https://github.com/davisking/dlib/commit/41a87e5926935c2328e9057a683c6ba6f9214cc9

Now all analysis can be done in Python:D

matiasdelellis commented 5 years ago

I was doing some testing, and with enough data, using threshold of 0.4 to generate the edges, gives excellent results..

The main groups are just excellent. :grimacing:
The strange group is reduced to a few people, but they are all blured/misaligned photos. I could ignore it without worrying..
- NOTE: We can add 'confidence' to the database? Maybe it's interesting for these groups.
The main groups lack faces, but they are grouped into smaller groups... which are also excellent.. We could work in joining groups at least in the view...? Still do not test if they are "stable", but at least they give good results.
There are a lot of more individual photos, which in principle just would not show them in this view..

For now, I'll make this number configurable.. and add an option to force clustering again..

stalker314314 commented 5 years ago

using threshold of 0.4

You mean on this threshold? https://github.com/matiasdelellis/facerecognition/blob/a5b84be242088e7520e5e42780f2c723c7a8f1e3/lib/BackgroundJob/Tasks/CreateClustersTask.php#L215

I always thought that number is very low. Let me try on my machine and share results!

The main groups are just excellent. ... The main groups lack faces,

Not sure how to read this?:) Is first group random?

Still do not test if they are "stable"

If you run clustering more than once and god same clusters, they are stable. This is because chinese whispers is inherently random (using rnd in algorithm). Also, you can also remove/add new image and force clustering again, to double-check if it is stable.

For now, I'll make this number configurable.. and add an option to force clustering again..

I agree. Also, side note, most of the magic numbers in code should be configurable:) Not sure what default value should be - do you think 0.5 or 0.4? Or if you can wait for me to run clustering again?

matiasdelellis commented 5 years ago

You mean on this threshold?

Yes..

I always thought that number is very low.

It can be considered low, since with .6 since probabilistically it gives good results... but the small error (false positives), seem to spread a lot. If the number is smaller, it is more accurate, and it results in more groups, but more precise. On the other hand, it should be easier to join groups than separate them..

Let me try on my machine and share results!

Just comment this line: https://github.com/matiasdelellis/facerecognition/blob/master/lib/BackgroundJob/Tasks/CreateClustersTask.php#L151

Remove the '14' here: https://github.com/matiasdelellis/facerecognition/blob/master/lib/Controller/PersonController.php#L70

.. , and run the command.. :wink:

The main groups are just excellent. ... The main groups lack faces,

Not sure how to read this?:) Is first group random?

I think that the main group is excellent because they do not have any wrong people, but there are other small groups of the same person, who could not group in the main one.

If you run clustering more than once and god same clusters, they are stable. This is because chinese whispers is inherently random (using rnd in algorithm)

Yes .. I mean the smaller groups, but they seem stable. The big ones always seem stable ..

Also, you can also remove/add new image and force clustering again, to double-check if it is stable.

I still not test changing the photos... I'll see maybe tomorrow... :wink:

I agree. Also, side note, most of the magic numbers in code should be configurable:)

Agree.. :+1:

Not sure what default value should be - do you think 0.5 or 0.4?

For the common user, it is better to have precise groups (even if they have different groups of the same person), than having a giant group with wrong people.. Therefore, if you also get good results, I prefer 0.4

But as always it must be well documented .. :wink:

Or if you can wait for me to run clustering again?

If you change we can put a flag in the preferences to recreate the groups .. Or the administrator will do it on demand.. The user does not have to know about the change.

stalker314314 commented 5 years ago

So, I was running clustering with various values from 0.4 - 0.55. Some observations:

I shared images of my setup with you privately
0.6 doesn't work for me with 2GB and 12000 images - it OOMs (:astonished:)
0.55 doesn't make sense - I got one giant cluster and only 18 other clusters..unusable
0.4 - 0.5 hits Goldilocks area and, in this range - we get into personal preferences
I am very annoyed with 0.4 as it is having many clusters of the same persons

Basically, it boils down to:

do you want more errors on wrong people inside the cluster (towards 0.5), or
more clusters of the same person (towards 0.4)

You can put whatever default value you want for common people, I will keep 0.5 for me as it hits my sweet spot:) (I am not a common people:smile:)And this is also what DLib is using. I don't really care about giant cluster as it is 1) less visible now with only 14 faces, 2) we know how we can make it invisible in the future with some math given above.

I am thinking to add some sort of slider on admin section to tweak this between 0.4-0.5 with label "Sensitivity" and two labels on each side "Less persons, lower precision" and "More clusters, higher sensitivity". Does this makes sense?

If you change we can put a flag in the preferences to recreate the groups

We probably don't want to recreate groups, as user might renamed them to his/her likening. If algorithm to make clusters stable is not working, we should fix it, but it should be able to "find its way from one grouping to get to other without destroying existing clusters.

matiasdelellis commented 5 years ago

0.6 doesn't work for me with 2GB and 12000 images - it OOMs (astonished)

Ohh.. You ara doing everything inside the QNAP?? I'm tempted to buy one.. :see_no_evil:

Basically, it boils down to:

do you want more errors on wrong people inside the cluster (towards 0.5), or
more clusters of the same person (towards 0.4)

So, the results are similar ..

I don't really care about giant cluster as it is 1) less visible now with only 14 faces, 2) we know how we can make it invisible in the future with some math given above.

I think you're concentrating on this fronted that hopefully will disappear.. :sweat_smile:

The access to this data, should be, the file application, the gallery, the new viewer¿?, where we will only see a single face in a particular image, where if there are many errors in groups, we can rename an entire wrong group. On the other hand, although there are many groups, if these are precise, they will be renamed progressively as they visualize the images, and the result will be more correct.

On the other hand, in this view, when click a title, we can show all the groups with the same name...?

I am thinking to add some sort of slider on admin section to tweak this between 0.4-0.5 with label "Sensitivity" and two labels on each side "Less persons, lower precision" and "More clusters, higher sensitivity". Does this makes sense?

Yes, I thought something like this .. :wink:

We probably don't want to recreate groups, as user might renamed them to his/her likening. If algorithm to make clusters stable is not working, we should fix it, but it should be able to "find its way from one grouping to get to other without destroying existing clusters.

If the administrator changes the preferences, we must act accordingly, although I think, that is an initial decision, and should not be changed when used in production with users.. It will be the responsibility of the administrator, if it breaks the user groups. :sweat_smile:

matiasdelellis commented 5 years ago

Just upload it here to attach in wiki.. https://github.com/matiasdelellis/facerecognition/wiki/Sensitivity/

:wink:

facerecognition-sensitivity-0_5

facerecognition-sensitivity-0_4

matiasdelellis commented 5 years ago

I see that never comment this, but with a sensitivity of ~~0.5~~ 0.4, at least in my case, all the mixed groups are non-frontal faces.

imagen

..and although there are a few of interesting photos, IMHO, all these faces could ignored.

If we can ignore these photos, it would still be much better than google photos (There are many other faces that can't find with it...)

Well, I will try to add the alignment of the face to the database, and try to detect these cases.

EDIT: Although my approach is valid, my low error rate is because I use 0.4

stalker314314 commented 5 years ago

How you can get alignment of the face? From landmarks is only I can think of? I mean, yes, we have them here: https://github.com/matiasdelellis/facerecognition/blob/edff28127bbd7577880742989e8b29129af84640/lib/BackgroundJob/Tasks/ImageProcessingTask.php#L331, but I never tried playing with them:) (we only passed them further down the processing pipeline) Good luck:D

matiasdelellis commented 5 years ago

How you can get alignment of the face? From landmarks is only I can think of?

For quick tests I am thinking of comparing the distance between the eyes, and the distance between them and the chin. When rotate the faces, I suppose the distance between both eyes gets smaller, and the distance with the chin is maintained... and I will try to work with these proportions.

Yes .. There you get the marks of 68 and 5 points. (At least now with 5 points. :). Note that the only reason to save it in db is if we do something optional. Meanwhile we could continue with the current database.

I just separated these photos, and I'm preparing a small test in php to prove my theory .. :open_hands:

matiasdelellis commented 5 years ago

Ok Ok... The theory failed .. :disappointed: imagen

Basically invents all values, (Well, this is expected :sweat_smile: ), but fails to predict the positions in profile .. :disappointed_relieved:

Therefore we cannot do anything useful here .. Now I will try the 68-point model ..

matiasdelellis commented 5 years ago

Ok.. Using 5 points definitely does not provide useful information. Using the 68-point model certainly improves, but when the face is very inclined, it cannot predict the points and invents them again. imagen

When looking at the points it seems that it looks the other way .. :sweat_smile:

But the value that I had never tried before and seems to be more useful is confidence. In the limited group of photos that I tested, discard many of the faces that are in my strange groups.

I send you by email, an example code with some of my test images. :wink:

If you agree, I would like to add both values to the faces table.

confidence: It will allow me to filter faces of poor quality. Any value < 1.0, could not be grouped, and maybe even show it, but we can make this limit configurable.
lankmarks: We probably won't do anything useful, (The only thing I can think of now is to automatically add glasses. Haha). But I guess it doesn't bother, and we can move forward with these values later.

Of course I have to try it with the most possible number of photos, but for that I must add it to the database.. :sweat_smile:

stalker314314 commented 5 years ago

Yes, having confidence and landmarks is really, really needed. Great to adding them! Once we have confidence, here are my ideas:

have some DEBUG switch (variable/app value) where we can show confidence along thumbnails in personal page. This will add hugely in debugging and any analysis! Just look how awesome are your images when there is confidence slapped on image!:) Anyone can share their set of images easily that way
Confidence should be threshold. We probably need to empirically figure out what should be default value. I guess point above would come handy:)
And if we have confidence threshold, should we discard faces immediately when detected if they are below threshold, or we should just exclude them from clustering? I say....immediately?
And we need to figure out if confidence threshold should be per user or per model. If we normalize confidence on [0-1] interval, I guess it doesn't have to be per model, it can be per user). When I say "per model", I mean - should it be in face_recognition_face_model table. If it is per user or global, it can be in user config values. If we want it to be "per user, per model", we need to put JSON to config user values, like {"model_id":1, "confidence_threshold": 0.8, "model_id":2, ""confidence_threshold": 0.7"}
Back to initial issue, you think that confidences can help with those "set of random faces" problem?
edit: Just came to my mind - if we have confidence threshold (and we discard faces below some threshold), we need to re-run clustering algorithm and to re-assess clustering thresholds. Maybe if we don't have faces with low confidence, clustering will be better (on its own!). Or maybe we will lose set of these random faces...

matiasdelellis commented 5 years ago

Yes, having confidence and landmarks is really, really needed. Great to adding them! Once we have confidence, here are my ideas:

:wink:

have some DEBUG switch (variable/app value) where we can show confidence along thumbnails in personal page. This will add hugely in debugging and any analysis! Just look how awesome are your images when there is confidence slapped on image!:) Anyone can share their set of images easily that way

I have been thinking about this for a long time. We need an advanced view where you can see all the data, and you can interact completely -Reset photos, groups, see the graphs of relationships-, and not work with the database for testing.

Confidence should be threshold. We probably need to empirically figure out what should be default value. I guess point above would come handy:)

First we must evaluate with real information, but 1.0 seems to be the magic number ..

And if we have confidence threshold, should we discard faces immediately when detected if they are below threshold, or we should just exclude them from clustering? I say....immediately?

For the development to be progressive, what I did in the pull request is to group them individually. With a high value, ¿2.0? the operation is like until now..

And we need to figure out if confidence threshold should be per user or per model. If we normalize confidence on [0-1] interval, I guess it doesn't have to be per model, it can be per user). When I say "per model", I mean - should it be in face_recognition_face_model table. If it is per user or global, it can be in user config values. If we want it to be "per user, per model", we need to put JSON to config user values, like {"model_id":1, "confidence_threshold": 0.8, "model_id":2, ""confidence_threshold": 0.7"}

I think it should be like the sensibility parameter, something very advanced for users .. and leave it as an administrator parameter. Maybe in the view of faces by image in files, we can sort them by this parameter so that the most reliable faces are first shown

Back to initial issue, you think that confidences can help with those "set of random faces" problem?

In principle we can reaffirm that the only task of this parameter is to discard faces. With my tests with photos of faces that I really wanted to discard, we could say that It discard half ... and It don't discard any of my important faces that I wanted to keep.

Now it is pending to make a real test, and confirm that it follows this behavior.

Minimally it will discard unfocused faces, and in strange positions, that although they can be familiar faces IMHO it’s acceptable. The important thing in my perspective is not to discard faces that could really be grouped.

edit: Just came to my mind - if we have confidence threshold (and we discard faces below some threshold), we need to re-run clustering algorithm and to re-assess clustering thresholds. Maybe if we don't have faces with low confidence, clustering will be better (on its own!). Or maybe we will lose set of these random faces...

For my part, I would expect that, minimally, the random group is drastically reduced. And although some may remain, it will be very down on the list

matiasdelellis commented 5 years ago

I write this here also in case anyone is interested .. :sweat_smile:

I am analyzing my images again testing the pull request https://github.com/matiasdelellis/facerecognition/pull/164 It will take several more hours, but these are the preliminary conclusions.

With 595 faces analyzed until now:

If completely ignoring the confidence -Using a value of 0.0-, there are 338 persons, and there is a random group of 9 faces...
If use an confidence < 0.8, -there are 71 faces (12%) with these values- and this results in 351 persons (13 more). and the mixed group no longer exists.
If use an confidence < 1.0, -there are 127 faces (21%) with these values- and this results in 354 persons (16 more). and the mixed group also no exists.

In percentage terms the number of photos with less confidences that 1.0 seems a lot, but seeing the little variation of the groups, I think this is going to be very interesting. :grinning:

matiasdelellis commented 5 years ago

Well.. These are my conclusions ..

WIth:

Images	3049
Faces found	1774
Sensibiltity	0,4

... and changing only the confidence parameter.

Min confidence	Nro Persons	% Incresead	Faces on Main Mixed Group	% of total	Time (S)	%
0,00	789	100,00 %	26	1,47 %	14,922	100,00 %
0,10	792	100,38 %	26	1,47 %	14,504	97,20 %
0,20	794	100,63 %	26	1,47 %	14,221	95,30 %
0,30	794	100,63 %	26	1,47 %	13,988	93,74 %
0,40	798	101,14 %	24	1,35 %	12,642	84,72 %
0,50	807	102,28 %	17	0,96 %	13,117	87,90 %
0,60	814	103,17 %	17	0,96 %	13,038	87,37 %
0,70	818	103,68 %	8	0,45 %	12,599	84,43 %
0,80	829	105,07 %	8	0,45 %	12,566	84,21 %
0,90	838	106,21 %	5	0,28 %	12,526	83,94 %
1,00	866	109,76 %	2	0,11 %	11,978	80,27 %

So the magic number seems to be between 0.7 and 1.0..

In percentage terms, the increase of persons may seem a lot, (10% with 1.0, or 5% 0.9), but not forget that these extra persons remain individual, and are stable by themselves. Therefore, if we edit its name in the file side panel, we can be sure that we will not modify an entire incorrect group. .. and since the confidentiality of face is a fixed number, if this parameter is not changed, we can be sure that tha face not regrouped incorrectly between analysis..

Also note that with 1.0 there are still mixed groups, but they were drastically reduced, and honestly with photos that I could discard.

Finally, as comparisons of photos with little confidence are avoided, the time used for grouping is reduced.

EDITED: Fix percentage of faces in main group of mixed person..

matiasdelellis commented 5 years ago

Testing with sensibility 0.5

Images	3049
Faces found	1774
Sensibility	0,5

Min confidence	Nro Persons	% Increased	Faces on Main Mixed Group	% of total
0,00	300	100,00 %	201	11,33 %
0,50	331	110,33 %	162	9,13 %
1,00	440	146,67 %	49	2,76 %

Well, Even with the confidence of 1.0, there is a relatively large mixed group, and the main groups have errors. But and although the main groups also have errors, using confidence dramatically improves this problem.

stalker314314 commented 5 years ago

I think this is nice analysis, but I think this is not addressed:

And if we have confidence threshold, should we discard faces immediately when detected if they are below threshold, or we should just exclude them from clustering? I say....immediately?

If we immediately ignore found face (do not insert it in DB at all!), I think we can have better results. First of all - you will not have problem on this line: https://github.com/matiasdelellis/facerecognition/pull/164/files#diff-324fc525c91828dc868593c5154275a8R229-R232 (there would be no question whether to create edge or not). Second - right now, in your analysis, number of persons is growing with confidence (BTW, is sensibility == confidence?) which is strange result. And it is strange result, I think, becausee each face < min_confidence is separate person now. I don't think this is good approach. So, if you ignore face immediately (I would get rid of them in ImageProcessingTask.php somewhere here: https://github.com/matiasdelellis/facerecognition/blob/master/lib/BackgroundJob/Tasks/ImageProcessingTask.php#L172) and re-run analysis, I think you will get more revealing results! Number of persons will be lower as confidence grows and number of faces will be lowered too.

It think it is not semantically correct to leave faces with low confidence in DB at all. They don't mean anything, they cannot participate in clustering logic, they are just garbage...

It basically boils down to removing "if" from CreateClusterTask and add if to ImageProcessingTask:) What do you think?

matiasdelellis commented 5 years ago

I think this is nice analysis, but I think this is not addressed:

Agree that it does not solve this problem, but it improves substantially.

If we immediately ignore found face (do not insert it in DB at all!), I think we can have better results. First of all - you will not have problem on this line: https://github.com/matiasdelellis/facerecognition/pull/164/files#diff-324fc525c91828dc868593c5154275a8R229-R232 (there would be no question whether to create edge or not). Second - right now, in your analysis, number of persons is growing with confidence (BTW, is sensibility == confidence?) which is strange result. And it is strange result, I think, becausee each face < min_confidence is separate person now. I don't think this is good approach. So, if you ignore face immediately (I would get rid of them in ImageProcessingTask.php somewhere here: https://github.com/matiasdelellis/facerecognition/blob/master/lib/BackgroundJob/Tasks/ImageProcessingTask.php#L172) and re-run analysis, I think you will get more revealing results! Number of persons will be lower as confidence grows and number of faces will be lowered too.

It think it is not semantically correct to leave faces with low confidence in DB at all. They don't mean anything, they cannot participate in clustering logic, they are just garbage...

To think that it is only garbage is debatable. Beyond the blurring, there are faces that are clearly recognizable, but due they are in a strange positions, the confidence number is low... and when you try, you will see that this value is far from being linear (Image/face quality, vs confidence).

Until we can assure this, with more evidence, I suppose the less invasive is that there remain individuals. With this approach, when changing the value of the confidence, the process only has to regroup the faces.

.. but trying your suggestion... Suppose I am testing how I did for these tables. What happens if completely discard 200 faces on 120 photos and then change the confidence? I have to completely analyze 120 photos, because before I preferred a better 'quality'?

NOTE: We do not have a process to reanalyze the photos.. That current faces are not invalid. Only some faces are missing in some images, but any information you have is still valid.

Well, Think in terms of energy, cpu charge, etc. Should we analyze again for a change that only affects clustering? We can try not to show them, and if you want not group them, but not discard information that has cost to obtain again.

BTW, is sensibility == confidence?

No.. Both determine how faces will be clustering, but they are completely different.

Confidence, it is a value that says how safe the model is, that what he got is a face.. and is an intrinsic data of face.
Sensitivity, says how similar two faces should be to try to group them.

stalker314314 commented 5 years ago

BTW, is sensibility == confidence?

OK, got it! I mixed sensitivity and sensibility:)

NOTE: We do not have a process to reanalyze the photos..

Got it. And yes, I agree it would be wasteful to reanalyze everything again:( OK, then...can we keep all found faces in DB, but hide them completely if they are below < confidence, like they never exist. That means not showing them in frontend, do not use them in clustering...just completely ignore them? I would just really like to try to not create separate persons out of them, or show them at all (to obtain better precision, even though recall might suffer, but overall UX will be better). I think this will help create better persons/clusters.

BTW, please ignore my thinking around those faces below confidence, it should not block you to push this change! We can always decide later what to do with "garbage" faces.

matiasdelellis commented 5 years ago

Hi @stalker314314 Finally, I merged it, and I hope that the improve this report, however, it is obvious that it will not solve it by itself. :sweat_smile:

Please when you finish analyzing your photos evaluate the results.. :smiley: Then if you want, we can close this report, and start a clean new one to think again.. :wink:

matiasdelellis commented 4 years ago

I had pending explain a little the 'minimum confidence' option..

https://github.com/matiasdelellis/facerecognition/wiki/Confidence

matiasdelellis commented 1 year ago

Ok.. I think the results have improved, and if you use model 4 (which uses everything discussed here), the groups are almost perfect, but in any case, now you can hide junk clusters (or just non-relevant people), and when tagging any face within these groups, it will only do so on that face.

matiasdelellis / facerecognition

Ignore clusters which is set of random faces #114