Open hanckmail opened 6 years ago
You mean that recognition takes 30 minutes per photo? It should not be that slow. Can you show your recognition code, or all of your code? I will help you check this issue. You can also try multiple processes, which may make your recognition faster.
No, 30 minutes per 5000 photos, but I think my GPU is not involved in the recognition process.
Sorry, I think I found what I needed:
import face_recognition
import pickle
import numpy as np

all_face_encodings = {}
img1 = face_recognition.load_image_file("obama.jpg")
all_face_encodings["obama"] = face_recognition.face_encodings(img1)[0]
img2 = face_recognition.load_image_file("biden.jpg")
all_face_encodings["biden"] = face_recognition.face_encodings(img2)[0]
# ... etc ...

# Save the encodings so they only have to be computed once
with open('dataset_faces.dat', 'wb') as f:
    pickle.dump(all_face_encodings, f)

# Load face encodings
with open('dataset_faces.dat', 'rb') as f:
    all_face_encodings = pickle.load(f)

# Grab the list of names and the list of encodings
face_names = list(all_face_encodings.keys())
face_encodings = np.array(list(all_face_encodings.values()))

# Try comparing an unknown image (use the first face found in it)
unknown_image = face_recognition.load_image_file("obama_small.jpg")
unknown_face = face_recognition.face_encodings(unknown_image)[0]
result = face_recognition.compare_faces(face_encodings, unknown_face)

# Print the result as a list of names with True/False
names_with_result = list(zip(face_names, result))
print(names_with_result)
I just need some help: how can I import a whole folder of about 3000 photos named 1, 2, 3, 4, etc.? I also need only the 'True' results, not the whole list, or to have the 'True' faces opened.
Or how can I use my dataset instead of the train folder in face_recognition_knn.py, or instead of the people_i_know folder in face_recognition? Please, can anybody help? :(
P.S. In the future, if I add more files, how can I update "dataset_faces.dat" without deleting it? Do I have to change "wb"?
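On the "wb" question: opening the file with "wb" always starts from an empty file, so one way to add faces later is to load the existing dictionary, merge the new entries, and write the whole thing back. The sketch below assumes the same {name: encoding} dictionary as the script above. The names `update_encodings`, `encode_folder` and `encode_fn` are hypothetical and not part of face_recognition; `encode_fn` stands for whatever turns an image path into one encoding (or None when no face is found).

```python
import os
import pickle

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def encode_folder(folder, encode_fn, skip=()):
    """Return {file stem: encoding} for each image in folder whose stem is not in skip.

    encode_fn(path) is a caller-supplied function (an assumption of this
    sketch) that returns one encoding, or None if no face was found.
    """
    found = {}
    for file_name in sorted(os.listdir(folder)):
        stem, ext = os.path.splitext(file_name)
        if ext.lower() not in IMAGE_EXTENSIONS or stem in skip:
            continue
        encoding = encode_fn(os.path.join(folder, file_name))
        if encoding is not None:
            found[stem] = encoding
    return found


def update_encodings(dat_path, new_items):
    """Merge new {name: encoding} pairs into the pickle file at dat_path."""
    encodings = {}
    if os.path.exists(dat_path):
        with open(dat_path, "rb") as f:
            encodings = pickle.load(f)
    encodings.update(new_items)  # on a name clash the newer encoding wins
    with open(dat_path, "wb") as f:
        pickle.dump(encodings, f)
    return encodings
```

With face_recognition, `encode_fn` could wrap `load_image_file` and `face_encodings` and return the first encoding or None. Passing the already stored dictionary as `skip` means only photos added since the last run get encoded.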
Can anybody give an example of how to train the model in face_recognition_knn.py? model_save_path = "" - must it be a path or a file name? Please, guys, one example of a full working script with training and prediction; I didn't understand how to make it work.
You just need to save the training results to a txt file the first time.
Like this:
knn_clf = train("knn_examples/train", "./model/model.txt")
and then predict like this:
preds = predict(join("knn_examples/test", img_path), model_save_path="./model/model.txt")
If the training data stays the same, you don't need to train the model again.
Thank you very, very much, bro, now everything works perfectly. I have 3 more questions: 1 - How many faces can be stored in the txt model? 2 - If I add more faces, do I need to train everything from the beginning, or is there a command to continue training? 3 - If I have 2 or 3 people who look like each other, the result contains only one of them; how can I get all of them, for example written to a txt file?
1 - As many as you want to train.
2 - I think that with the current code we have to train again. It would be helpful if someone could comment on this point: do we need to retrain on the whole dataset (e.g. 10,000 images) whenever a new observation is added? Is there something like an incremental/batch learning algorithm for nearest neighbours?
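On point 2: as far as I know, scikit-learn's KNeighborsClassifier has no `partial_fit`, so the classifier itself has to be refitted when faces are added. The expensive step is likely computing the encodings, not the fit, so one option is to cache X and y on disk and only encode the new photos. This is a sketch under that assumption; `append_training_data` and `refit_knn` are hypothetical helper names, not part of face_recognition_knn.py.

```python
import pickle
from pathlib import Path


def append_training_data(cache_path, new_encodings, new_labels):
    """Append new encodings/labels to a cached (X, y) pair and return both lists."""
    cache = Path(cache_path)
    if cache.exists():
        with cache.open("rb") as f:
            X, y = pickle.load(f)
    else:
        X, y = [], []
    X.extend(new_encodings)
    y.extend(new_labels)
    with cache.open("wb") as f:
        pickle.dump((X, y), f)
    return X, y


def refit_knn(X, y, n_neighbors=1):
    """Refit a KNN classifier on all cached encodings (old and new)."""
    from sklearn import neighbors  # imported lazily; only needed for the refit

    knn_clf = neighbors.KNeighborsClassifier(
        n_neighbors=n_neighbors, algorithm="ball_tree", weights="distance"
    )
    knn_clf.fit(X, y)
    return knn_clf
```

The old images are never read again: only the new encodings are computed, and the refit runs on vectors that are already cached.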
3 - I think you need to play with kneighbors.
I almost got it, my friend. I added the line print(closest_distances):
closest_distances = knn_clf.kneighbors(faces_encodings, n_neighbors=8)
print(closest_distances)
It prints the 8 closest neighbors, but only their distances, not their names, like this:
(array([[0. , 0.38515912, 0.81486565, 0.8384682 , 0.8384682 , 0.86994276, 0.93531323, 0.93531323]]), array([[3, 7, 4, 2, 6, 0, 1, 5]]))
Please help me add the names.
closest_distances = knn_clf.kneighbors(faces_encodings, n_neighbors=1)
is_recognized = [closest_distances[0][i][0] <= DIST_THRESH for i in range(len(X_faces_loc))]
return [(basename, pred) if rec else (basename, "No match found") for pred, rec in zip(knn_clf.predict(faces_encodings), is_recognized)]
You need to modify these 3 lines of code. In closest_distances[0][i], the first [0] selects the distances, array([[0. , 0.38515912, 0.81486565, 0.8384682 , 0.8384682 , 0.86994276, 0.93531323, 0.93531323]]). In closest_distances[1][i], the [1] selects the indices, array([[3, 7, 4, 2, 6, 0, 1, 5]]).
So keep a for loop and collect all the nearest neighbours in the is_recognized variable. Then, for all the recognized values, print the names. The names are produced on the 3rd line, where the is_recognized value is true.
BTW, did your 2nd point get clarified?
Sorry, I'm a newbie, I didn't understand anything ((
As I understand it, 0.38515912 is the distance of the second most similar face; my problem is printing its name.
I just want to see their names instead of the distances:
0.38515912, 0.81486565, 0.8384682 , 0.8384682 , 0.86994276, 0.93531323, 0.93531323
Sorry for my stupidity.
Can you print your is_recognized and tell me the result? (It is the line right after closest_distances.)
If your result is just [True], or you get an error like "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()", try storing the boolean values in numpy arrays.
Once we have the boolean values, we can work on printing the names. Let me know if you are able to print the boolean values for all 8 nearest neighbours.
I divided the standard face_recognition_knn.py into two parts. The 1st trains on the dataset once:
```
from math import sqrt
from sklearn import neighbors
from os import listdir
from os.path import isdir, join, isfile, splitext
import pickle
from PIL import Image, ImageFont, ImageDraw, ImageEnhance
import face_recognition
from face_recognition import face_locations
from face_recognition.cli import image_files_in_folder

ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg'}


def train(train_dir, model_save_path = "", n_neighbors = None, knn_algo = 'ball_tree', verbose=False):
    """
    Trains a k-nearest neighbors classifier for face recognition.

    :param train_dir: directory that contains a sub-directory for each known person, with its name.
     (View in source code to see train_dir example tree structure)
     Structure:
        <train_dir>/
        ├── <person1>/
        │   ├── <somename1>.jpeg
        │   ├── <somename2>.jpeg
        │   ├── ...
        ├── <person2>/
        │   ├── <somename1>.jpeg
        │   └── <somename2>.jpeg
        └── ...
    :param model_save_path: (optional) path to save model on disk
    :param n_neighbors: (optional) number of neighbors to weigh in classification. Chosen automatically if not specified.
    :param knn_algo: (optional) underlying data structure to support knn. default is ball_tree
    :param verbose: verbosity of training
    :return: returns knn classifier that was trained on the given data.
    """
    X = []
    y = []
    for class_dir in listdir(train_dir):
        if not isdir(join(train_dir, class_dir)):
            continue
        for img_path in image_files_in_folder(join(train_dir, class_dir)):
            image = face_recognition.load_image_file(img_path)
            faces_bboxes = face_locations(image)
            # Only images with exactly one face are usable for training
            if len(faces_bboxes) != 1:
                if verbose:
                    print("image {} not fit for training: {}".format(img_path, "didn't find a face" if len(faces_bboxes) < 1 else "found more than one face"))
                continue
            X.append(face_recognition.face_encodings(image, known_face_locations=faces_bboxes)[0])
            y.append(class_dir)

    if n_neighbors is None:
        n_neighbors = int(round(sqrt(len(X))))
        if verbose:
            print("Chose n_neighbors automatically as:", n_neighbors)

    knn_clf = neighbors.KNeighborsClassifier(n_neighbors=n_neighbors, algorithm=knn_algo, weights='distance')
    knn_clf.fit(X, y)

    if model_save_path != "":
        with open(model_save_path, 'wb') as f:
            pickle.dump(knn_clf, f)
    return knn_clf


if __name__ == "__main__":
    knn_clf = train("knn_examples/train", "knn_examples/train/model/model.txt")
```
And the 2nd gets the result:
```
from math import sqrt
from sklearn import neighbors
from os import listdir
from os.path import isdir, join, isfile, splitext
import pickle
from PIL import Image, ImageFont, ImageDraw, ImageEnhance
import face_recognition
from face_recognition import face_locations
from face_recognition.cli import image_files_in_folder

ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg'}


def predict(X_img_path, knn_clf = None, model_save_path ="knn_examples/train/model/model.txt", DIST_THRESH = .45):
    """
    recognizes faces in given image, based on a trained knn classifier

    :param X_img_path: path to image to be recognized
    :param knn_clf: (optional) a knn classifier object. if not specified, model_save_path must be specified.
    :param model_save_path: (optional) path to a pickled knn classifier. if not specified, knn_clf must be specified.
    :param DIST_THRESH: (optional) distance threshold in knn classification. the larger it is, the more chance of misclassifying an unknown person to a known one.
    :return: a list of names and face locations for the recognized faces in the image: [(name, bounding box), ...].
        For faces of unrecognized persons, the name 'N/A' will be passed.
    """
    if not isfile(X_img_path) or splitext(X_img_path)[1][1:] not in ALLOWED_EXTENSIONS:
        raise Exception("invalid image path: {}".format(X_img_path))

    if knn_clf is None and model_save_path == "":
        raise Exception("must supply knn classifier either through knn_clf or model_save_path")

    # Load the pickled classifier instead of training again
    if knn_clf is None:
        with open(model_save_path, 'rb') as f:
            knn_clf = pickle.load(f)

    X_img = face_recognition.load_image_file(X_img_path)
    X_faces_loc = face_locations(X_img)
    if len(X_faces_loc) == 0:
        return []

    faces_encodings = face_recognition.face_encodings(X_img, known_face_locations=X_faces_loc)

    closest_distances = knn_clf.kneighbors(faces_encodings, n_neighbors=1)
    is_recognized = [closest_distances[0][i][0] <= DIST_THRESH for i in range(len(X_faces_loc))]

    # predict classes and cull classifications that are not with high confidence
    return [(pred, loc) if rec else ("N/A", loc) for pred, loc, rec in zip(knn_clf.predict(faces_encodings), X_faces_loc, is_recognized)]


def draw_preds(img_path, preds):
    """
    shows the face recognition results visually.

    :param img_path: path to image to be recognized
    :param preds: results of the predict function
    :return:
    """
    source_img = Image.open(img_path).convert("RGBA")
    draw = ImageDraw.Draw(source_img)
    for pred in preds:
        loc = pred[1]
        name = pred[0]
        # (top, right, bottom, left) => (left,top,right,bottom)
        draw.rectangle(((loc[3], loc[0]), (loc[1],loc[2])), outline="red")
        draw.text((loc[3], loc[0] - 30), name, font=ImageFont.truetype('Pillow/Tests/fonts/FreeMono.ttf', 30))
    source_img.show()


if __name__ == "__main__":
    for img_path in listdir("knn_examples/test"):
        preds = predict(join("knn_examples/test", img_path), model_save_path="knn_examples/train/model/model.txt")
        print(preds)
        draw_preds(join("knn_examples/test", img_path), preds)
```
Everything works well, but I am afraid that with similar faces I can get a wrong result and won't see other possible results that may be the true one.
My is_recognized result is [True]
Make your is_recognized store the boolean values for all 8 neighbours.
Currently we are not able to print the names of the nearest neighbours because is_recognized is set up so that it stores only one neighbour.
That's why only one [True] value comes out. If we make is_recognized store the boolean values of all the nearest neighbours, we can then try to print the names for all of those neighbours.
How can I do it? Can you help?
Yes, the lines below print the bool values of all the neighbours:
is_recognized = [(closest_distances[0][i] <= DIST_THRESH) for i in range(len(X_faces_loc))]
print(is_recognized[0])
I set n_neighbors to 2, so it prints [ True True] for me.
Next, I am thinking about how to go through is_recognized and relate it to for pred, rec in zip(knn_clf.predict(faces_encodings), is_recognized)
pred is the name of the person.
We are almost there. What do you think?
I am sorry if this is the wrong way to print names. I am also new to this library. But this should work.
is_recognized = [(closest_distances[0][i] <= DIST_THRESH) for i in range(len(X_faces_loc))]
print(is_recognized[0])
gives this result:
(array([[0.34390288, 0.41492873]]), array([[3, 7]])) <generator object predict.
Bro, I got it! It's really simple. We did too much.
knn_clf.kneighbors returns two values. Just change the code like below:
closest_distances, indices = knn_clf.kneighbors(faces_encodings, n_neighbors=2)
Now the indices give us a link to the names in the original training data :)
Can you post the whole code? I'm going crazy editing it a million times )
"Now the indices give us a link to the names in the original training data :)" - how can we print the names? Nothing changed in my result.
Actually, I customized too many parts of it, which leads to more confusion. Can you perhaps print your indices? It would be something like [1,2,3...]
Link these indices with the training y data like this:
for i in indices[0]:
    print(y[i])  # This gives the names of the class_dir.
Optional: if you want to print the actual image file name, just change this in the train function:
y.append(os.path.splitext(os.path.basename(img_path))[0])
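To make the link between indices and names concrete, here is a minimal sketch that pairs every neighbour index with its training label and keeps only the neighbours within a distance threshold. It works on plain nested lists shaped like the two arrays returned by `kneighbors`. The function name `neighbor_names` and the labels in the usage example are hypothetical; `labels` must be the same y list, in the same order, that the classifier was fitted on.

```python
def neighbor_names(distances, indices, labels, threshold=0.5):
    """Return, per query face, a list of (label, distance) for close neighbours.

    distances and indices have one row per face found in the query image,
    as returned by kneighbors(); labels is the y list used for fitting.
    """
    results = []
    for dist_row, idx_row in zip(distances, indices):
        matches = [
            (labels[i], float(d))
            for d, i in zip(dist_row, idx_row)
            if d <= threshold
        ]
        results.append(matches)
    return results


# Distances and indices taken from the output posted earlier in this thread;
# the labels are made-up placeholders for the 8 training samples.
labels = ["p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7"]
distances = [[0.0, 0.38515912, 0.81486565]]
indices = [[3, 7, 4]]
print(neighbor_names(distances, indices, labels))
# [[('p3', 0.0), ('p7', 0.38515912)]]
```

The indices refer to positions in the training data, not to class numbers, which is why y has to be available in the predict step.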
What are the indices? When I print them, the result is [[3, 7]]. I didn't train anything of my own yet, only the examples in the knn_examples folder plus a copy of them, 8 folders in total.
Copy the code below and tell me what happens:
import os
from math import sqrt
from sklearn import neighbors
from os import listdir
from os.path import isdir, join, isfile, splitext
import pickle
from PIL import Image, ImageFont, ImageDraw, ImageEnhance
import face_recognition
from face_recognition import face_locations
from face_recognition.cli import image_files_in_folder

# Kept at module level so that predict() can look up names by index
X = []
y = []
ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg'}


def train(train_dir, model_save_path = "", n_neighbors = None, knn_algo = 'ball_tree', verbose=False):
    """
    Trains a k-nearest neighbors classifier for face recognition.

    :param train_dir: directory that contains a sub-directory for each known person, with its name.
     (View in source code to see train_dir example tree structure)
     Structure:
        <train_dir>/
        ├── <person1>/
        │   ├── <somename1>.jpeg
        │   ├── <somename2>.jpeg
        │   ├── ...
        ├── <person2>/
        │   ├── <somename1>.jpeg
        │   └── <somename2>.jpeg
        └── ...
    :param model_save_path: (optional) path to save model on disk
    :param n_neighbors: (optional) number of neighbors to weigh in classification. Chosen automatically if not specified.
    :param knn_algo: (optional) underlying data structure to support knn. default is ball_tree
    :param verbose: verbosity of training
    :return: returns knn classifier that was trained on the given data.
    """
    for class_dir in listdir(train_dir):
        if not isdir(join(train_dir, class_dir)):
            continue
        for img_path in image_files_in_folder(join(train_dir, class_dir)):
            image = face_recognition.load_image_file(img_path)
            faces_bboxes = face_locations(image)
            if len(faces_bboxes) != 1:
                if verbose:
                    print("image {} not fit for training: {}".format(img_path, "didn't find a face" if len(faces_bboxes) < 1 else "found more than one face"))
                continue
            X.append(face_recognition.face_encodings(image, known_face_locations=faces_bboxes)[0])
            # Label each sample with its image file name instead of class_dir
            y.append(os.path.splitext(os.path.basename(img_path))[0])

    if n_neighbors is None:
        n_neighbors = int(round(sqrt(len(X))))
        if verbose:
            print("Chose n_neighbors automatically as:", n_neighbors)

    knn_clf = neighbors.KNeighborsClassifier(n_neighbors=n_neighbors, algorithm=knn_algo, weights='distance')
    knn_clf.fit(X, y)

    if model_save_path != "":
        with open(model_save_path, 'wb') as f:
            pickle.dump(knn_clf, f)
    return knn_clf


def predict(X_img_path, knn_clf = None, model_save_path ="", DIST_THRESH = .5):
    """
    recognizes faces in given image, based on a trained knn classifier

    :param X_img_path: path to image to be recognized
    :param knn_clf: (optional) a knn classifier object. if not specified, model_save_path must be specified.
    :param model_save_path: (optional) path to a pickled knn classifier. if not specified, knn_clf must be specified.
    :param DIST_THRESH: (optional) distance threshold in knn classification. the larger it is, the more chance of misclassifying an unknown person to a known one.
    :return: a list of names and face locations for the recognized faces in the image: [(name, bounding box), ...].
        For faces of unrecognized persons, the name 'N/A' will be passed.
    """
    if not isfile(X_img_path) or splitext(X_img_path)[1][1:] not in ALLOWED_EXTENSIONS:
        raise Exception("invalid image path: {}".format(X_img_path))

    if knn_clf is None and model_save_path == "":
        raise Exception("must supply knn classifier either through knn_clf or model_save_path")

    if knn_clf is None:
        with open(model_save_path, 'rb') as f:
            knn_clf = pickle.load(f)

    X_img = face_recognition.load_image_file(X_img_path)
    X_faces_loc = face_locations(X_img)
    if len(X_faces_loc) == 0:
        return []

    faces_encodings = face_recognition.face_encodings(X_img, known_face_locations=X_faces_loc)

    closest_distances,indices = knn_clf.kneighbors(faces_encodings, n_neighbors=1)
    #is_recognized = [closest_distances[0][i][0] <= DIST_THRESH for i in range(len(X_faces_loc))]

    # predict classes and cull classifications that are not with high confidence
    for i in indices[0]:
        print(y[i])#This gives the names of the classdir.


def draw_preds(img_path, preds):
    """
    shows the face recognition results visually.

    :param img_path: path to image to be recognized
    :param preds: results of the predict function
    :return:
    """
    source_img = Image.open(img_path).convert("RGBA")
    draw = ImageDraw.Draw(source_img)
    for pred in preds:
        loc = pred[1]
        name = pred[0]
        # (top, right, bottom, left) => (left,top,right,bottom)
        draw.rectangle(((loc[3], loc[0]), (loc[1],loc[2])), outline="red")
        draw.text((loc[3], loc[0] - 30), name, font=ImageFont.truetype('Pillow/Tests/fonts/FreeMono.ttf', 30))
    source_img.show()


if __name__ == "__main__":
    knn_clf = train("knn_examples/train")
    for img_path in listdir("knn_examples/test"):
        preds = predict(join("knn_examples/test", img_path) ,knn_clf=knn_clf)
        print(preds)
        #draw_preds(join("knn_examples/test", img_path), preds)
Traceback (most recent call last):
File "face_recognition_knn.py", line 120, in <module>
knn_clf = train("knn_examples/train")
File "face_recognition_knn.py", line 49, in train
y.append(os.path.splitext(os.path.basename(img_path))[0])
NameError: name 'os' is not defined
When I change y to y.append(class_dir), and is_recognized = [closest_distances[0][i][0] to is_recognized = [closest_distances[0][i], I get the result (thank God and you), but with an error:
Traceback (most recent call last):
File "face_recognition_knn.py", line 124, in <module>
draw_preds(join("knn_examples/test"), preds)
File "face_recognition_knn.py", line 109, in draw_preds
source_img = Image.open(img_path).convert("RGBA")
File "/home/user/.local/lib/python3.5/site-packages/PIL/Image.py", line 2543, in open
fp = builtins.open(filename, "rb")
IsADirectoryError: [Errno 21] Is a directory: 'knn_examples/test'
And one more question: as I mentioned before, I divided the script into 2 parts, the 1st to train and the 2nd to find the result, but if I do that now it won't get knn_clf, am I right?
Ah, that os error: I forgot to add import os at the start. I have added it now, so copy and paste the code again. Dividing your script does not affect anything. I commented out the draw_preds call and concentrated only on printing names. Whenever you get stuck, try debugging in the PyCharm editor, which has an inline debugging feature, or use print() statements.
When I divide it I get this:
Traceback (most recent call last):
File "face_recognition_knnpredict.py", line 73, in <module>
preds = predict(join("knn_examples/test", img_path) ,knn_clf=knn_clf)
NameError: name 'knn_clf' is not defined
If I remove knn_clf = train("knn_examples/train"), as I did earlier, I get:
Traceback (most recent call last):
File "face_recognition_knnpredict.py", line 71, in <module>
knn_clf = train("knn_examples/train/model/model.txt")
NameError: name 'train' is not defined
P.S. You did a great job, the script works well. One (I hope last) problem remains: how can I make 'train' and 'predict' work separately?
What I understand is that you want to keep train in an x.py file and the predict functions in a y.py file. Don't put a main block in both x.py and y.py. Keep the if __name__ == "__main__": block in one .py file and access the functions from the other file through an import statement.
Why do you want to separate these functions into different files? It is not that big a file. You can keep everything in one file if the errors keep coming.
I want to train once and then compare test faces with my model. If I use the full script, it will overwrite my model and take time to train again. We have a working script, but every time I run it, it retrains on the photos stored in the 'train' folder and then compares them with the photo in the 'test' folder. My photos in the train folder stay the same, so I don't want to lose time retraining every time.
You need to separate the train function into 3 parts:
def scan(): return X, y
def store(): call scan() and dump (X, y) to a file
def retrieve(): load the file and return X, y
You just need to save the training results to a txt file the first time. Like this: knn_clf = train("knn_examples/train", "./model/model.txt") and then predict like this: preds = predict(join("knn_examples/test", img_path), model_save_path="./model/model.txt") If the training data stays the same, you don't need to train the model again.
Also, for the kneighbors index linking part that we discussed: just call the retrieve function to get y, then link the indices from knn_clf.kneighbors with the stored y values from the txt file.
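The two NameErrors above come from the predict script having neither a knn_clf variable nor a train function of its own. One way around that, sketched below, is to pickle the classifier together with the y labels and have the predict script load both, training only when no saved file exists. `save_model`, `load_or_train` and `train_fn` are hypothetical names; `train_fn` stands for any function that returns (knn_clf, labels), for example a variant of train() that also returns y.

```python
import os
import pickle


def save_model(path, knn_clf, labels):
    """Store the classifier and its training labels in one pickle file."""
    with open(path, "wb") as f:
        pickle.dump({"clf": knn_clf, "labels": labels}, f)


def load_or_train(path, train_fn):
    """Return (knn_clf, labels), calling train_fn() only if path does not exist."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            saved = pickle.load(f)
        return saved["clf"], saved["labels"]
    knn_clf, labels = train_fn()
    save_model(path, knn_clf, labels)
    return knn_clf, labels
```

Keeping the labels next to the classifier matters for the neighbour lookup: the indices from kneighbors only make sense against the y list used at fit time, and a module-level y is empty in a fresh process that never called train.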
Thank you for everything, but I'm new to Python and I don't understand anything without a full example. I decided to use another script, and it works well for me:
import face_recognition
import pickle
import numpy as np
from PIL import Image, ImageFont, ImageDraw, ImageEnhance

# Load face encodings
with open('dataset_faces.dat','rb') as f:
    all_face_encodings = pickle.load(f)

# Grab the list of names and the list of encodings
face_names = list(all_face_encodings.keys())
face_encodings = np.array(list(all_face_encodings.values()))

# Try comparing an unknown image (use the first face found in it)
unknown_image = face_recognition.load_image_file("test.png")
unknown_face = face_recognition.face_encodings(unknown_image)[0]
result = face_recognition.api.compare_faces(face_encodings, unknown_face, tolerance=0.5)

# Write every (True/False, name) pair to a text file
names_with_result = list(zip(result, face_names))
with open("file.txt", "w") as file:
    print(names_with_result, file=file)
It takes my previously trained file dataset_faces.dat and compares it with the file test.png using a tolerance of 0.5. As a result it creates a file with both the True and False results and the file names of the trained pictures. I think that is enough for me. Can you help me once more? I want the script to print only the True results with the names of the trained pictures. I think it's not difficult for you.
result = face_recognition.api.compare_faces(face_encodings, unknown_face, tolerance=0.5)
if result: # Get only True values
    names_with_result = list(zip(result, face_names))
    print(names_with_result)
if result: ??? And that's all? Or do I need to put True somewhere?
It returns all the values ((
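The reason everything is still printed is that `if result:` only checks whether the list is non-empty, which it always is here; it does not look at the individual True/False values. A minimal sketch of filtering per element, assuming `result` and `face_names` are parallel lists as in the script above (`true_matches` is a hypothetical helper name):

```python
def true_matches(result, face_names):
    """Return only the names whose comparison result is True."""
    return [name for name, matched in zip(face_names, result) if matched]


# Stand-in values; in the script above they come from compare_faces()
# and from the keys of the pickled dictionary.
result = [True, False, True]
face_names = ["1", "2", "3"]
print(true_matches(result, face_names))
# ['1', '3']
```

The returned list can be written to file.txt in place of names_with_result.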
I want to make a database that contains about 5,000 images. I use face_recognition_knn.py and it works fine, but how can I organize it to make recognition faster? For example, I import 1 photo and it searches the 5,000 images for about 30 minutes; then I import a second photo and it also takes 30 minutes. Can I 'index' those 5,000 photos or something like that? All 5,000 photos will always stay in one folder. I didn't really understand how to do model retraining.
P.S. Sorry for my English.
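A note on the "index" idea: once every known photo has been encoded and the encodings saved (as with dataset_faces.dat earlier in the thread), a lookup only has to encode the new photo and measure its distance to each stored encoding. The sketch below shows that comparison step in plain Python, assuming Python 3.8+ for `math.dist` and a {name: encoding} dictionary. `closest_faces` is a hypothetical name; the default of 0.6 mirrors the default tolerance of `compare_faces`.

```python
import math


def closest_faces(known, unknown, tolerance=0.6, limit=None):
    """Return [(distance, name), ...] for known faces within tolerance, nearest first.

    known is a {name: encoding} dict loaded from disk; unknown is the single
    encoding of the query photo. Euclidean distance is used.
    """
    scored = ((math.dist(encoding, unknown), name) for name, encoding in known.items())
    hits = sorted((d, n) for d, n in scored if d <= tolerance)
    return hits[:limit]
```

Because the result is sorted by distance, look-alike faces show up as several entries instead of a single best match, which also covers the earlier question about similar people.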