filipbasara0 / relic

A simple PyTorch implementation of the Representation Learning via Invariant Causal Mechanisms self-supervised contrastive learning paper
MIT License
10 stars, 3 forks

applying on my own dataset #2

Open zahrakhanjani128 opened 6 months ago

zahrakhanjani128 commented 6 months ago

Hi, I have a dataset of spectrogram images extracted from audio data, and I would love to apply ReLIC to it to see whether it helps with my downstream task. Could you please guide me on how to apply ReLIC to my own dataset? Thanks a lot in advance!

filipbasara0 commented 6 months ago

Hey, thanks for taking an interest!

That sounds like a really fun experiment and I would love to see the results!

Do you have a way to generate a dataset of positive and negative samples? This would require either a more specific augmentation pipeline, or you could leverage the temporal structure of the audio files. For example, you could divide each audio file into segments, create a spectrogram for each segment, and treat spectrograms from the same file as positives. Spectrograms from other audio files would then serve as negatives.
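As a rough illustration of the segment-based idea, here is a minimal sketch of how the sampling could look. Everything in it is an assumption made for illustration: the function names (`split_into_segments`, `build_index`, `sample_triplet`), the fixed non-overlapping segment length, and the file names are hypothetical and not part of this repo. It only picks which sample ranges to use; computing the spectrogram for each range is left out.

```python
import random
from typing import Dict, List, Tuple

SampleRange = Tuple[int, int]  # (start, end) sample offsets within one audio file


def split_into_segments(num_samples: int, segment_len: int) -> List[SampleRange]:
    """Non-overlapping (start, end) ranges; a trailing partial segment is dropped."""
    return [(s, s + segment_len)
            for s in range(0, num_samples - segment_len + 1, segment_len)]


def build_index(file_lengths: Dict[str, int], segment_len: int) -> Dict[str, List[SampleRange]]:
    """Map each file to its segments, keeping only files with at least two segments
    (a file needs two segments to provide an anchor and a positive)."""
    index = {}
    for file_id, num_samples in file_lengths.items():
        segments = split_into_segments(num_samples, segment_len)
        if len(segments) >= 2:
            index[file_id] = segments
    return index


def sample_triplet(index: Dict[str, List[SampleRange]], rng: random.Random):
    """Pick an anchor and a positive from the same file, and a negative from another file."""
    anchor_file, negative_file = rng.sample(sorted(index), 2)
    anchor_seg, positive_seg = rng.sample(index[anchor_file], 2)
    negative_seg = rng.choice(index[negative_file])
    return ((anchor_file, anchor_seg),
            (anchor_file, positive_seg),
            (negative_file, negative_seg))


# Hypothetical file lengths in samples (e.g. 16 kHz audio, 1-second segments).
index = build_index({"a.wav": 48000, "b.wav": 32000, "short.wav": 10000}, segment_len=16000)
anchor, positive, negative = sample_triplet(index, random.Random(0))
```

Each returned item is a `(file, (start, end))` pair; you would slice the waveform with that range and convert the slice to a spectrogram before feeding it to the model. In a contrastive batch, the other files in the batch can also act as negatives, so sampling an explicit negative may not be needed.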

Let me know if this makes sense or if you have another approach in mind. Once we have good positive and negative samples, we can easily apply ReLIC to your problem!

zahrakhanjani128 commented 6 months ago

Thank you so much for your response and the great idea. I really appreciate it! Some of my audio files are AI-generated (fake) and some are genuine (real). What if I use the fake ones as positives and the real ones as negatives? Would that work? Then, instead of extracting a spectrogram for each audio segment, we could extract one for the entire audio clip. The downstream task is detecting fake samples.

zahrakhanjani128 commented 5 months ago

Filip, I'm looking forward to your thoughts on this!

filipbasara0 commented 5 months ago

Hi, sorry for the late reply!

You could still try using ReLIC, but I think your problem is better set up as a binary classification task, since you already have labeled positive (fake) and negative (real) samples.

I think you could achieve great results by training a CNN or fine-tuning a spectrogram transformer model, depending on how much data you have.

Wish you all the best with your project!

zahrakhanjani128 commented 5 months ago

Hi Filip, thank you so much for your guidance. I have tried some CNN-based models too; I am a student working on a fake audio detection project. I am a little confused, though: does ReLIC not work for binary classification problems? Can it only be applied to multi-class problems? I remember the paper includes an example of cat and dog classification. Many thanks in advance for any further guidance and clarification.