selimsef / dfdc_deepfake_challenge

A prize-winning solution for the DFDC challenge
MIT License
787 stars, 209 forks

ValueError "continuous target data is not supported with label binarization" during validation #10

Closed: xysong1201 closed this issue 4 years ago

xysong1201 commented 4 years ago

I encountered this error during validation:

  File "finetune_xy.py", line 446, in <module>
    main()
  File "finetune_xy.py", line 303, in main
    summary_writer=summary_writer)
  File "finetune_xy.py", line 311, in evaluate_val
    bce, probs, targets = validate(model, data_loader=data_val)
  File "finetune_xy.py", line 366, in validate
    fake_loss = log_loss(y[fake_idx], x[fake_idx], labels=[0, 1])
  File "/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py", line 73, in inner_f
    return f(**kwargs)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 2206, in log_loss
    transformed_labels = lb.transform(y_true)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/preprocessing/_label.py", line 491, in transform
    sparse_output=self.sparse_output)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py", line 73, in inner_f
    return f(**kwargs)
  File "/opt/conda/lib/python3.7/site-packages/sklearn/preprocessing/_label.py", line 680, in label_binarize
    "binarization" % y_type)
ValueError: continuous target data is not supported with label binarization
[1]+  Exit 1                  nohup python -u finetune_xy.py --config configs/b7.json > log.out
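
For reference, here is a minimal sketch of how this error can be reproduced outside the training script. It assumes the cause is a non-integer value in the true labels passed to `log_loss` (for example an averaged or smoothed label); the arrays below are made-up illustration data, not values from the actual run.

```python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical data: 0.5 is not a valid class label, so scikit-learn
# classifies y_true as a "continuous" target instead of a binary one.
y_true = np.array([1.0, 0.5, 1.0])
y_pred = np.array([0.9, 0.6, 0.8])

raised = False
try:
    log_loss(y_true, y_pred, labels=[0, 1])
except ValueError as err:
    raised = True
    print("log_loss failed:", err)
```

If the labels are exactly 0.0 or 1.0, the same call should succeed, which suggests checking whether `targets[vid]` ever contains fractional values.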

Could you briefly explain the following part of your code?

    data_x = []
    data_y = []
    for vid, score in probs.items():
        score = np.array(score)
        lbl = targets[vid]

        score = np.mean(score)
        lbl = np.mean(lbl)
        data_x.append(score)
        data_y.append(lbl)
    y = np.array(data_y)
    x = np.array(data_x)
    fake_idx = y > 0.1
    real_idx = y < 0.1
    fake_loss = log_loss(y[fake_idx], x[fake_idx], labels=[0, 1])
    real_loss = log_loss(y[real_idx], x[real_idx], labels=[0, 1])
    print("{}fake_loss".format(prefix), fake_loss)
    print("{}real_loss".format(prefix), real_loss)

Thank you.
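
As far as can be read from the snippet itself, it averages each video's scores and labels into one value per video, splits the videos into fake and real groups by the averaged label, and reports a separate log loss for each group. The sketch below shows one possible workaround under the assumption that the averaged labels are not exactly 0 or 1: it thresholds them back to integer classes before calling `log_loss`. The function name `per_video_losses`, the sample dictionaries, and the fractional label for `vid_c` are all hypothetical; the 0.1 cut-off simply mirrors the snippet.

```python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical inputs: video id -> list of scores / list of labels.
probs = {"vid_a": [0.9, 0.8, 0.7], "vid_b": [0.2, 0.1], "vid_c": [0.6, 0.4]}
targets = {"vid_a": [1, 1, 1], "vid_b": [0, 0], "vid_c": [0.99, 0.99]}


def per_video_losses(probs, targets, threshold=0.1):
    """Average scores and labels per video, then compute log loss per class."""
    vids = list(probs)
    x = np.array([np.mean(probs[vid]) for vid in vids])
    y_mean = np.array([np.mean(targets[vid]) for vid in vids])

    # Convert averaged labels back to integer classes so that
    # scikit-learn treats them as binary rather than continuous.
    y = (y_mean > threshold).astype(int)

    fake_idx = y == 1
    real_idx = y == 0
    # labels=[0, 1] is needed because each subset contains a single class.
    fake_loss = log_loss(y[fake_idx], x[fake_idx], labels=[0, 1])
    real_loss = log_loss(y[real_idx], x[real_idx], labels=[0, 1])
    return fake_loss, real_loss


fake_loss, real_loss = per_video_losses(probs, targets)
print("fake_loss", fake_loss)
print("real_loss", real_loss)
```

Note that this sketch does not handle the case where one of the two groups is empty; `log_loss` would fail on an empty subset, so a validation set with only real or only fake videos would need an extra guard.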