Open xellDart opened 2 years ago
You can decode train.rec and train.idx into JPEGs and labels directly. Here is a code snippet you can refer to:
```python
import mxnet as mx

imgrec = mx.recordio.MXIndexedRecordIO(args.idx_path, args.bin_path, 'r')
# Record 0 holds metadata; its label gives the index range of the image records.
s = imgrec.read_idx(0)
header, _ = mx.recordio.unpack(s)
print(header.label)
imgidx = list(range(1, int(header.label[0])))
for i in imgidx:
    img_info = imgrec.read_idx(i)
    header, img = mx.recordio.unpack(img_info)
    label_int = int(header.label)
    # you can save img data and label_int here
```
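As a minimal sketch of the "save img data and label_int" step, the loop could write each record to disk and build a list file. The names `scalar_label` and `export_rec`, the per-label folder layout, and the `path label` line format are all hypothetical and not taken from TFace or insightface. The sketch also assumes that each record payload is already JPEG-encoded bytes and that `header.label` may be either a scalar or an array.

```python
import numbers
import os


def scalar_label(label):
    """Return an int class id from a label that may be a scalar or an array."""
    if isinstance(label, numbers.Number):
        return int(label)
    return int(label[0])


def export_rec(idx_path, rec_path, out_dir, list_path):
    """Write each image record under out_dir/<label>/ and list it in list_path."""
    import mxnet as mx  # imported lazily so scalar_label works without MXNet

    imgrec = mx.recordio.MXIndexedRecordIO(idx_path, rec_path, 'r')
    header, _ = mx.recordio.unpack(imgrec.read_idx(0))
    with open(list_path, 'w') as listing:
        for i in range(1, int(header.label[0])):
            header, img = mx.recordio.unpack(imgrec.read_idx(i))
            label = scalar_label(header.label)
            label_dir = os.path.join(out_dir, str(label))
            os.makedirs(label_dir, exist_ok=True)
            img_path = os.path.join(label_dir, f'{i}.jpg')
            with open(img_path, 'wb') as f:
                f.write(img)  # assumption: payload is already JPEG bytes
            # assumption: one "path label" pair per line; adapt to your loader
            listing.write(f'{img_path} {label}\n')
```

A call such as `export_rec('train.idx', 'train.rec', 'images', 'DATA.labelpath')` would then produce the images and the list file. Check the list format your data loader expects before relying on it.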
@wjxzju sorry for reopening, but I am trying to extract quality features to train the model using
python3 generate_pseudo_labels/extract_embedding/extract_feats.py
with this config:
```python
class Config:
    # dataset
    data_root = ''
    img_list = 'DATA.labelpath'
    eval_model = 'generate_pseudo_labels/extract_embedding/model/SDD_FIQA_checkpoints_r50.pth'
    outfile = '../feats_npy/Embedding_Features.npy'
    # data preprocess
    transform = T.Compose([
        T.Resize((112, 112)),
        T.ToTensor(),
        T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    ])
    # network settings
    backbone = 'R_50'  # [MFN, R_50]
    device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
    multi_GPUs = [0]
    embedding_size = 512
    batch_size = 512
    pin_memory = True
    num_workers = 6


config = Config()
```
I'm using an RTX 3090 and 32 GB of RAM, with embedding_size = 512 and batch_size = 512.
Everything runs fine, but at the final step I get:
```
Number of samples: 5822653
Sample_num = 5822653
100%|████████████████████████████████████▉| 11372/11373 [49:48<00:00, 3.81it/s]
Traceback (most recent call last):
  File "/home/miguel/Documentos/TFace/quality/generate_pseudo_labels/extract_embedding/extract_feats.py", line 96, in <module>
    try: feats[start_idx:end_idx, :] = embeddings
ValueError: could not broadcast input array from shape (189,) into shape (189,512)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/miguel/Documentos/TFace/quality/generate_pseudo_labels/extract_embedding/extract_feats.py", line 98, in <module>
    except: feats[start_idx:, :] = embeddings
ValueError: could not broadcast input array from shape (189,) into shape (189,512)
```
What are the correct values of batch_size and embedding_size for my hardware configuration?
Thanks!
Hi, @xellDart
According to your log, the cause of error seems like the fact that the shape of
Thank you.
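The shapes in the traceback can be reproduced with NumPy alone, which may help narrow the problem down. The sketch below reuses the sample count, batch_size and embedding_size from the post. The helper `try_assign` and the arrays of ones are placeholders, not part of extract_feats.py. It shows that the final batch holds 189 samples, and that a 1-D array broadcasts into a full 512 x 512 slice without complaint but fails on the 189 x 512 slice.

```python
import numpy as np

num_samples = 5822653
batch_size = 512
embedding_size = 512

num_batches = -(-num_samples // batch_size)  # ceiling division
last_batch = num_samples % batch_size        # size of the final, partial batch


def try_assign(rows, values):
    """Return True if `values` can be assigned into a (rows, embedding_size) slice."""
    feats = np.zeros((rows, embedding_size), dtype=np.float32)
    try:
        feats[:, :] = values
        return True
    except ValueError:
        return False


# One value per sample, full batch: (512,) is broadcast across all 512 rows.
full_ok = try_assign(batch_size, np.ones(batch_size, dtype=np.float32))
# One value per sample, last batch: (189,) cannot broadcast into (189, 512).
last_ok = try_assign(last_batch, np.ones(last_batch, dtype=np.float32))
# A 2-D (189, 512) batch fits as expected.
fixed_ok = try_assign(last_batch, np.ones((last_batch, embedding_size), dtype=np.float32))

print(num_batches, last_batch, full_ok, last_ok, fixed_ok)
```

If this matches what happened, the error would come from the model output having one value per image rather than an `embedding_size`-long vector, and not from the batch size or the hardware. The log alone does not confirm this, so printing `embeddings.shape` just before the assignment would be a quick check. Note that under this reading the earlier full batches would have been filled by silent broadcasting, so the saved features would need regenerating.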
Hi, I am trying to train the quality model using:
MS1M-ArcFace (85K ids/5.8M images) [5,7] from https://github.com/deepinsight/insightface/tree/master/recognition/_datasets_
But this dataset stores all of the training data in a .rec file, and I can't tell whether I can use it with the quality model. I searched for a way to decode it, but I couldn't get it to work.
I already have train.rec and train.idx