Open thkdlflghlsdghilr opened 5 years ago
Hello thkdlflghlsdghilr! This is kinda what I did at the beginning. I scraped images of the top 48 most famous people for each day of the year, so that's where the 13,000 images came from. The numpy array "denseArray27K.npy" has all the encodings of these celebrities stored as a numpy array.
As for adding new images, check out the readme! Thanks to jontonsoup4, adding new images is as easy as running one command.
Like carykh said, bulk importing of faces has been added! All you need to do is write a scraper to gather images from your site of choice, output them to the extra_images folder, and then run ./add_extra_images.sh. For scraping, I'd recommend using requests with beautifulsoup4 (where I get my name from 😉). You could also extend the image encoder to skip human input if you're dealing with a massive dataset and have already taken care of file name sanitization. The default name function I wrote works best with snake_case file names (firstname_lastname.jpg).
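For anyone writing such a scraper, here is a minimal sketch of the file naming step, so that downloaded images land in extra_images with snake_case names like firstname_lastname.jpg. The helper names `snake_case_filename` and `save_image` are hypothetical and not part of the repository; the sketch only assumes that the folder name and naming pattern mentioned above are what the import script expects.

```python
import re
import unicodedata
from pathlib import Path


def snake_case_filename(display_name: str, extension: str = ".jpg") -> str:
    """Turn a display name like 'Taylor Swift' into 'taylor_swift.jpg'.

    Hypothetical helper: not taken from the repository.
    """
    # Drop accents so the result is plain ASCII (e.g. 'Beyoncé' -> 'Beyonce').
    ascii_name = (
        unicodedata.normalize("NFKD", display_name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_name.lower()).strip("_")
    if not slug:
        raise ValueError(f"no usable characters in name: {display_name!r}")
    return slug + extension


def save_image(image_bytes: bytes, display_name: str,
               folder: str = "extra_images") -> Path:
    """Write already-downloaded image bytes under a snake_case file name.

    The default folder name is the one mentioned in this thread.
    """
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / snake_case_filename(display_name)
    target.write_bytes(image_bytes)
    return target
```

A scraper would call `save_image(downloaded_bytes, "Taylor Swift")` for each face it fetches and then run ./add_extra_images.sh once at the end.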
Oh, ok cool! I just skimmed the readme and watched the video you made instead and you didn't talk about it to my knowledge.
Is there a way to remove the old images, i.e. completely swap out the data set (to make it work for flowers, in my case)?
This is half an issue and half a suggestion. First, make an image scraper to scrape all the faces from https://www.famousbirthdays.com/, and then make a Python program that encodes a folder of the scraped images into the system. This could greatly improve the user experience, since encoding a face is a complicated process (for some people, not me). I can code this if you want. I don't really know why I'm telling you this if I can make it myself.
P.S. I'd be pleased if you let me make it and then included it in the repository.
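The scraping half of this suggestion could start from something like the sketch below, which pulls (name, image URL) pairs out of a page's HTML. It uses the standard library's `html.parser` rather than the requests and beautifulsoup4 combination recommended elsewhere in this thread, purely so that it runs without extra installs. It assumes each face is an `<img>` tag whose `alt` text holds the person's name; the real markup of any given site is not known here and would need checking. `ImageCollector` and `collect_images` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect (name, absolute_url) pairs from <img> tags.

    Assumption: the person's name is stored in the alt attribute.
    """

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        alt = (attributes.get("alt") or "").strip()
        # Skip images with no source or no name (logos, spacers, icons).
        if src and alt:
            # Resolve relative paths against the page URL.
            self.images.append((alt, urljoin(self.base_url, src)))


def collect_images(html: str, base_url: str):
    """Return every named image found in the given HTML string."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.images
```

Fetching the page itself and downloading each URL is left out on purpose; whichever HTTP client is used, the downloaded files would then go through the existing bulk import step.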