NickleDave opened this issue 2 years ago
Good catch, that's actually a mistake;
wrap_ctc is indeed called before CTCLoss, which itself expects the log probabilities.
That part of the code is also not expected to differ from the base code provided to be filled in.
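For reference, `nn.CTCLoss` expects log probabilities of shape (T, N, C), i.e. the output of log_softmax applied to the logits; here is a minimal sketch with made-up sizes, not the lab code itself:

```python
import torch
import torch.nn as nn

# nn.CTCLoss consumes log probabilities of shape (T, N, C),
# i.e. the output of log_softmax, not raw logits.
T, N, C = 50, 4, 28                      # time steps, batch size, vocab (incl. blank)
logits = torch.randn(T, N, C)
log_probs = logits.log_softmax(dim=2)

targets = torch.randint(1, C, (N, 10))   # index 0 is reserved for the blank
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

loss = nn.CTCLoss(blank=0)(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```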
@NickleDave I'm reopening this issue; I realize that this lab is not working perfectly.
Tested on the French corpus of CommonVoice at least, my code fails to overfit a single minibatch and also fails to overfit the training set. However, when training on a larger corpus, it ends up producing recognitions that look like the ground truth, so there must still be a bug somewhere. I spent some time going through the whole code but was not able to catch anything wrong.
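For context, the single-minibatch sanity check I'm running is roughly the following (a self-contained sketch with a toy model and made-up shapes, not the actual lab code):

```python
import torch
import torch.nn as nn

# Sanity check: with a correct model and loss, the CTC loss should fall
# close to zero when training repeatedly on one fixed minibatch.
T, N, F_in, C = 50, 4, 13, 28                  # frames, batch, features, vocab
inputs = torch.randn(T, N, F_in)               # one fixed minibatch
targets = torch.randint(1, C, (N, 10))         # index 0 is the blank
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

rnn = nn.GRU(F_in, 64)
proj = nn.Linear(64, C)
ctc = nn.CTCLoss(blank=0)
optimizer = torch.optim.Adam(
    list(rnn.parameters()) + list(proj.parameters()), lr=1e-3
)

for step in range(500):
    optimizer.zero_grad()
    hidden, _ = rnn(inputs)
    log_probs = proj(hidden).log_softmax(dim=2)
    loss = ctc(log_probs, targets, input_lengths, target_lengths)
    loss.backward()
    optimizer.step()
    if step % 100 == 0:
        print(step, loss.item())
```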
Since you were digging deep into the code, did you possibly discover other issues? Did it work when you tried it, maybe on languages other than French?
Thank you for your insights.
Hi @jeremyfix, thank you for letting me know about this.
Long story short, we are extending a previously designed model for annotating birdsong:
https://github.com/yardencsGitHub/tweetynet
But I am in the middle of a big revamp of the framework we use to run experiments:
https://github.com/vocalpy/vak/tree/version-1.0
I expect to be back to running experiments by the end of Feb.
Mainly I was looking at your code since it's one of the only good, detailed examples I could find of using the `torch.nn.utils.rnn` API for a model that is not pure NLP.
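For anyone who finds this later, the pattern I mean is packing variable-length feature sequences (e.g. spectrogram frames rather than tokens) before the RNN; a rough sketch with made-up shapes:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Pack padded, variable-length feature sequences before an RNN so the
# recurrence skips the padding frames.
batch = torch.randn(3, 100, 13)           # (batch, max_frames, features)
lengths = torch.tensor([100, 80, 60])     # true length of each example

packed = pack_padded_sequence(batch, lengths, batch_first=True,
                              enforce_sorted=False)
rnn = torch.nn.GRU(13, 32, batch_first=True)
packed_out, _ = rnn(packed)
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
print(out.shape, out_lengths)             # (3, 100, 32), tensor([100, 80, 60])
```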
I haven't discovered any other issues but I will definitely tell you if I do.
Hi again @jeremyfix
I noticed in this solution that the comment on this line appears to contain another line of code that maybe should not be commented out:
https://github.com/jeremyfix/deeplearning-lectures/blob/b3862d6dd1af45bea1a99f9b26a0c8baa1520422/LabsSolutions/02-pytorch-asr/main_ctc.py#L42
shouldn't it actually be something like this (I'm guessing the variable names from context):
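```python
# my guess at the intended line; variable names assumed from context
log_probs = torch.nn.functional.log_softmax(logits, dim=2)
```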
so that you transform the "logits" to log softmax?
If you're deliberately not converting to log softmax for some reason, I'd be curious to know.