HHousen / speaker-change-detection

Speaker change detection using SincNet and an LSTM/Transformer
GNU General Public License v3.0

Question for AUROC & validation step #6

Open Sean652039 opened 1 month ago

Sean652039 commented 1 month ago

Hi Housen, about the metrics part: when scd is True, you pass num_frames as the number of classes to AUROC, but in the initialization num_classes = 1 when scd is True. Could you please explain the reason for this difference?

    AUROC(
        self.num_frames if self.scd else self.num_classes,
        pos_label=1,
        average="macro",
        compute_on_step=False,
    )

Also, should we define the task type? Segmentation is a multilabel task while scd is a binary task, so something like this: task="binary" if self.scd else "multilabel"
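For reference, a minimal sketch of how that task selection might look. It assumes torchmetrics 0.11 or later, where AUROC takes a task argument and the multilabel task expects num_labels. The helper name auroc_kwargs is hypothetical and not part of the repository. It only builds the keyword arguments, so it runs without torchmetrics installed.

```python
from typing import Any, Dict


def auroc_kwargs(scd: bool, num_classes: int) -> Dict[str, Any]:
    """Build keyword arguments for torchmetrics.AUROC (hypothetical helper)."""
    if scd:
        # Speaker change detection: one change / no-change score per frame.
        return {"task": "binary"}
    # Segmentation: several speakers can be active in the same frame,
    # so each speaker is treated as an independent label.
    return {"task": "multilabel", "num_labels": num_classes, "average": "macro"}


print(auroc_kwargs(scd=True, num_classes=1))
print(auroc_kwargs(scd=False, num_classes=4))

# Assumed usage inside the model (requires torchmetrics >= 0.11):
#   from torchmetrics import AUROC
#   self.auroc = AUROC(**auroc_kwargs(self.scd, self.num_classes))
```

Whether the binary task is the right fit for the scd output depends on how the predictions are shaped when they reach the metric, which the snippet above does not settle.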

Sean652039 commented 1 month ago

Another issue I ran into: there is an error in model.py, line 220, in validation_step:

    self.validation_metric(
        y_pred.squeeze() if self.scd else torch.transpose(y_pred, 1, 2),
        target.squeeze() if self.scd else torch.transpose(target, 1, 2),
    )

I set scd to True and got:

    prd_batch_size, prd_num_speakers, prd_num_frames = preds.shape
    ValueError: not enough values to unpack (expected 3, got 2)

So I'm curious why we need to call .squeeze() to get rid of the num_classes dimension, which equals 1, since doing so leads to this error.
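The shape mismatch can be illustrated without torch. The sketch below mimics squeeze() and torch.transpose on plain shape tuples, assuming y_pred has shape (batch, frames, 1) when scd is True; the sizes 8 and 500 are made up. unpack_like_metric is a hypothetical stand-in for the unpacking line in the traceback.

```python
from typing import Tuple

Shape = Tuple[int, ...]


def squeeze_shape(shape: Shape) -> Shape:
    """Mimic tensor.squeeze(): drop every dimension of size 1."""
    return tuple(d for d in shape if d != 1)


def transpose_shape(shape: Shape, dim0: int, dim1: int) -> Shape:
    """Mimic torch.transpose(tensor, dim0, dim1): swap two dimensions."""
    out = list(shape)
    out[dim0], out[dim1] = out[dim1], out[dim0]
    return tuple(out)


def unpack_like_metric(shape: Shape) -> Tuple[int, int, int]:
    """Stand-in for the metric's three-way unpack of preds.shape."""
    batch_size, num_speakers, num_frames = shape
    return batch_size, num_speakers, num_frames


y_pred_shape = (8, 500, 1)  # assumed (batch, frames, num_classes=1)

squeezed = squeeze_shape(y_pred_shape)  # only two dimensions remain
try:
    unpack_like_metric(squeezed)
except ValueError as err:
    print("squeezed:", squeezed, "->", err)

# Keeping the class dimension and swapping it with the frame dimension
# yields the three dimensions the metric tries to unpack.
transposed = transpose_shape(y_pred_shape, 1, 2)
print("transposed:", transposed, "->", unpack_like_metric(transposed))
```

If that shape assumption holds, transposing in the scd case as well would keep the three dimensions the metric unpacks. Whether that matches the intended use of validation_metric is not confirmed here. Note also that squeeze() without arguments drops the batch dimension too when the batch size is 1.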