Open jeeveenn opened 8 months ago
It is very strange. Have you also tested the training script at https://github.com/tencent-ailab/IP-Adapter/blob/main/tutorial_train_faceid.py ?
I'll give it a try
Hello. Could you explain the json file?
--data_json_file="*.json" \
I want to train the model, but I'm not sure how to make the json file.
list of dict: [{"image_file": "1.png", "id_embed_file": "faceid.bin"}]
I extract the ID embedding offline and save it to faceid.bin.
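For reference, a minimal sketch (with hypothetical filenames) of how such a data JSON file could be written, assuming one precomputed embedding file per image:

```python
import json

# Hypothetical filenames: each entry pairs a training image with its
# precomputed identity-embedding file (extracted offline beforehand).
entries = [
    {"image_file": "1.png", "id_embed_file": "faceid.bin"},
    {"image_file": "2.png", "id_embed_file": "faceid_2.bin"},
]

with open("data.json", "w") as f:
    json.dump(entries, f, indent=2)
```

The training script would then be pointed at it via `--data_json_file="data.json"`.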
Thank you. Let me try that.
Will it be OK like this?
# Load face encoder (requires insightface, opencv-python, and numpy;
# load_image comes from diffusers, and resize_img is a user-defined helper)
import cv2
import numpy as np
from insightface.app import FaceAnalysis
from diffusers.utils import load_image

app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

face_image = load_image("./girl2.jpg")
face_image = resize_img(face_image)  # user-defined resize helper
face_info = app.get(cv2.cvtColor(np.array(face_image), cv2.COLOR_RGB2BGR))
face_info = sorted(face_info, key=lambda x: (x['bbox'][2] - x['bbox'][0]) * (x['bbox'][3] - x['bbox'][1]))[-1]  # keep only the largest face
face_emb = face_info['embedding']
1. How do I save face_emb to faceid.bin correctly? After saving it directly to a file, I get errors like this:
_pickle.UnpicklingError: Caught UnpicklingError in DataLoader worker process 0.
_pickle.UnpicklingError: invalid load key, '$'.
Could it be a file format error?
2. The data.json file:
{
"image_file": "faceimage.jpg",
"text": "a handsome man",
"id_embed_file": "0321faceid.bin"
},
Must faceimage.jpg have the same size as face_info, or can it be another size? I mean, the original face box of an image is not always square like 256×256 or 512×512; if we resize it forcibly, the ground truth of the face will be destroyed.
- I use torch.save(face_info['embedding'], "faceid.bin")
- A normal way is to resize the shorter side to 512, then center-crop. (You can also center-crop with the help of the face bounding box.)
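The second point could be sketched like this with Pillow (a hedged example, not the exact training preprocessing): resize the shorter side to 512, then take a 512×512 center crop.

```python
from PIL import Image

def resize_and_center_crop(img, size=512):
    """Resize the shorter side to `size`, then center-crop a size x size patch."""
    w, h = img.size
    scale = size / min(w, h)                       # shorter side becomes `size`
    img = img.resize((round(w * scale), round(h * scale)), Image.BICUBIC)
    w, h = img.size
    left, top = (w - size) // 2, (h - size) // 2
    return img.crop((left, top, left + size, top + size))

# Synthetic image standing in for a real training photo:
out = resize_and_center_crop(Image.new("RGB", (640, 480)))
print(out.size)  # -> (512, 512)
```

For face data, the center crop can instead be taken around the detected bounding box, as suggested above.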
Hi @xiaohu2015, I ran into some new issues:
- I trained with "tutorial_train_faceid.py".
- I then converted "pytorch_model.bin" to adapter.bin with the parameters "ip_adapter, xxx".
- I ran inference with "ip_adapter-full-face_demo.ipynb", which fails at the line ip_model = IPAdapterFull(pipe, image_encoder_path, ip_ckpt, device, num_tokens=257):
--> 139 self.image_proj_model.load_state_dict(state_dict["image_proj"])
140 ip_layers = torch.nn.ModuleList(self.pipe.unet.attn_processors.values())
141 ip_layers.load_state_dict(state_dict["ip_adapter"])
File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:2041, in Module.load_state_dict(self, state_dict, strict)
2036 error_msgs.insert(
2037 0, 'Missing key(s) in state_dict: {}. '.format(
2038 ', '.join('"{}"'.format(k) for k in missing_keys)))
2040 if len(error_msgs) > 0:
-> 2041 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
2042 self.__class__.__name__, "\n\t".join(error_msgs)))
2043 return _IncompatibleKeys(missing_keys, unexpected_keys)
RuntimeError: Error(s) in loading state_dict for MLPProjModel:
Missing key(s) in state_dict: "proj.3.weight", "proj.3.bias".
Unexpected key(s) in state_dict: "norm.weight", "norm.bias".
size mismatch for proj.0.weight: copying a param with shape torch.Size([1024, 512]) from checkpoint, the shape in current model is torch.Size([1280, 1280]).
size mismatch for proj.0.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([1280]).
size mismatch for proj.2.weight: copying a param with shape torch.Size([3072, 1024]) from checkpoint, the shape in current model is torch.Size([768, 1280]).
size mismatch for proj.2.bias: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([768]).
I guess there is some dim or shape mismatch between the training and inference stages? What is the difference between tutorial_train_faceid.py and tutorial_train_plus.py? Also, there is no dedicated "ip_adapter-face.ipynb" file corresponding to tutorial_train_faceid.py.
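One way to narrow down such errors (a debugging sketch, not code from the repo) is to diff the converted checkpoint's keys and shapes against what the inference model expects. Shapes are written as plain tuples here to keep the example dependency-free; with real checkpoints you would compare `tensor.shape` instead.

```python
def diff_state_dicts(ckpt_shapes, model_shapes):
    """Compare two {param_name: shape} dicts the way load_state_dict reports."""
    missing = sorted(set(model_shapes) - set(ckpt_shapes))
    unexpected = sorted(set(ckpt_shapes) - set(model_shapes))
    mismatched = sorted(k for k in set(ckpt_shapes) & set(model_shapes)
                        if ckpt_shapes[k] != model_shapes[k])
    return missing, unexpected, mismatched

# Shapes taken from the error above (checkpoint vs. current MLPProjModel):
ckpt = {"proj.0.weight": (1024, 512), "proj.2.weight": (3072, 1024),
        "norm.weight": (1024,)}
model = {"proj.0.weight": (1280, 1280), "proj.2.weight": (768, 1280),
         "proj.3.weight": (768,)}
print(diff_state_dicts(ckpt, model))
# -> (['proj.3.weight'], ['norm.weight'], ['proj.0.weight', 'proj.2.weight'])
```

A mismatch of this kind usually means the checkpoint was trained with one projection model but loaded into a demo built around a different one, so checking which demo matches the training script is the first thing to verify.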
How is the progress? I notice you are training with the original SD 1.5, "runwayml/stable-diffusion-v1-5". Did anyone try community models like "SG161222/Realistic_Vision_V6.0_B1_noVAE" as the training base model?
Hello, thank you for your excellent work. When I was training faceid_plus, the generated images became worse as the training steps increased. I used about 1,500,000 images for training. What could be the reason? The training code:
The generated images: first picture at 10,000 steps; second picture at 270,000 steps.