nora-illanes opened this issue 4 years ago
Why do you feel like you need to reduce dimensions? I would recommend just using the full 512-d vector. I recently trained on the full 3.3 million images from VGGFace2 and it went fine. There shouldn't be any reason why you want to reduce these from a classification perspective. Of course, you might run out of RAM while loading all the data, so you should batch the data during training, e.g. a simple hidden layer feeding into a softmax, fed with fit_generator from Keras or something similar.
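A minimal sketch of that idea, assuming the 512-d embeddings are already computed; the generator name, class count, and batch sizes below are placeholders, not part of this thread:

```python
# Feed precomputed 512-d FaceNet embeddings to a small hidden-layer + softmax
# classifier in batches, so the full dataset never has to sit in RAM.
import numpy as np
from tensorflow import keras

num_classes = 30        # assumed number of identities
embedding_dim = 512

model = keras.Sequential([
    keras.layers.Dense(256, activation="relu", input_shape=(embedding_dim,)),  # simple hidden layer
    keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

def embedding_batches(batch_size=512):
    """Hypothetical generator yielding (embeddings, labels) batches loaded from
    disk; replace the random placeholders with your own loading code."""
    while True:
        X = np.random.rand(batch_size, embedding_dim).astype("float32")
        y = np.random.randint(0, num_classes, size=batch_size)
        yield X, y

# Older Keras used model.fit_generator(...); recent versions accept a
# generator directly in fit().
model.fit(embedding_batches(), steps_per_epoch=100, epochs=10)
```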
If you want to try it, just perform PCA:
https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
This will usually create more general vectors, and I would guess you will lose some accuracy, unless you are looking for features that measure similarity, in which case this would be a good idea.
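For reference, a quick sketch of that PCA step with scikit-learn, assuming `embeddings_512` is an (N, 512) array you have already computed (the file names are placeholders):

```python
# Reduce precomputed 512-d FaceNet embeddings to 128-d with PCA.
import numpy as np
from sklearn.decomposition import PCA

embeddings_512 = np.load("embeddings_512.npy")   # placeholder file name

pca = PCA(n_components=128)
embeddings_128 = pca.fit_transform(embeddings_512)

# Sanity check: how much variance the 128 components retain.
print(pca.explained_variance_ratio_.sum())

np.save("embeddings_128.npy", embeddings_128)    # placeholder file name
```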
Why do you feel like you ... There shouldn't be any reason why you want to reduce these from a classification perspective.
I need to build a library of embedding vectors and save it to one gigantic file.
If you want to try it, just perform PCA: ... unless you are looking for features that measure similarity, in which case this would be a good idea.
Hmm, I am looking for similarity features. Thank you for your reply. The vectors are high-dimensional, but I guess that is just how it is.
Thank you again!
How about training your own model? You can specify --embedding_size=128 when calling the train_*.py scripts.
Thanks for your reply!
How about training your own model? You can specify --embedding_size=128 when calling the train_*.py scripts.
I will try this. What do you mean by train_*? I only have one script from the facenet repository, and that is train.py.
You can find train_tripletloss.py and train_softmax.py in the src folder.
Maybe you can apply PCA to 512-d vectors and reduce them to 128-d.
I would agree with @ryecomp here, as long as reducing the dimensionality doesn't hurt performance. @nora-illanes, Siamese nets can be trained either with a triplet loss, so that Euclidean distance between embeddings can be used later, or with a softmax activation when the goal is classification.
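For anyone unfamiliar with the triplet loss mentioned above, here is a rough sketch (not the exact facenet implementation): it pulls anchor/positive pairs together and pushes anchor/negative pairs apart in Euclidean distance.

```python
import tensorflow as tf

def triplet_loss(anchor, positive, negative, margin=0.2):
    """anchor, positive, negative: (batch, embedding_dim) embeddings.
    Encourages d(anchor, positive) + margin < d(anchor, negative)."""
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=1)
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))
```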
Hello, have you managed to turn the 512 dimensions into 128 dimensions? How did you do it?
Why do you feel like you need to reduce dimensions? ... There shouldn't be any reason why you want to reduce these from a classification perspective.
Actually, there are reasons: a KNN classifier is heavily affected by the curse of dimensionality. If you have only a few training samples, a 128-d or even a 64-d embedding is a better choice.
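A quick way to check this point on your own data, assuming `X` is an (N, 512) array of embeddings and `y` the identity labels (the file names are placeholders):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X = np.load("embeddings_512.npy")   # placeholder file name
y = np.load("labels.npy")           # placeholder file name

# Baseline: KNN directly on the 512-d embeddings.
knn = KNeighborsClassifier(n_neighbors=3)
print("512-d:", cross_val_score(knn, X, y, cv=5).mean())

# Same classifier on PCA-reduced embeddings.
for dims in (128, 64):
    clf = make_pipeline(PCA(n_components=dims), KNeighborsClassifier(n_neighbors=3))
    print(f"{dims}-d:", cross_val_score(clf, X, y, cv=5).mean())
```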
How about training your own model? You can specify --embedding_size=128 when calling the train_*.py scripts.
Can you please tell me why I get a very low accuracy of about 0.012 when training my own model? I am using the script below. The dataset has 1020 images in 30 classes. @ryecomp
```
python src/train_softmax.py \
--logs_base_dir ~/logs \
--models_base_dir ~/models \
--data_dir ~/test-database-x-182 \
--image_size 160 \
--model_def models.inception_resnet_v1 \
--optimizer ADAGRAD \
--learning_rate -1 \
--max_nrof_epochs 150 \
--keep_probability 0.8 \
--random_crop \
--random_flip \
--learning_rate_schedule_file ./data/learning_rate_schedule_classifier_msceleb.txt \
--weight_decay 5e-5 \
--center_loss_factor 1e-2 \
--center_loss_alfa 0.9 \
--validate_every_n_epochs 1 \
--validation_set_split_ratio 5 \
--prelogits_norm_loss_factor 5e-4 \
--embedding_size 128 \
--batch_size 34
```
Hello everybody,
I'm using either the 20180408-102900 or the 20180402-114759 pre-trained .pb model on a very small set of images. It works well with classify.py (I'm only interested in the embeddings). However, I really need the 128-element vector, since the final dataset will contain about 10,000 images.
How can I make the embedding coming from the LFW-trained model return such a vector?
I really need help with this.
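Following the PCA suggestion earlier in the thread, one way to get 128-d vectors out of the 512-d pre-trained model looks roughly like this. The tensor names ("input:0", "phase_train:0", "embeddings:0") are the ones the facenet frozen graphs usually expose, so double-check them against your .pb file; the image batch below is only a placeholder.

```python
import numpy as np
import tensorflow.compat.v1 as tf
from sklearn.decomposition import PCA

tf.disable_eager_execution()

# Load the frozen graph (file name assumed to match the model mentioned above).
with tf.gfile.GFile("20180402-114759.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Session() as sess:
    tf.import_graph_def(graph_def, name="")
    # Placeholder batch; in practice use aligned, pre-whitened 160x160 face crops.
    images = np.random.rand(200, 160, 160, 3).astype(np.float32)
    emb_512 = sess.run("embeddings:0",
                       feed_dict={"input:0": images, "phase_train:0": False})

# Fit PCA on your whole library of 512-d embeddings, then keep the 128-d vectors.
pca = PCA(n_components=128).fit(emb_512)
emb_128 = pca.transform(emb_512)
print(emb_128.shape)
```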