Open tyang92 opened 4 years ago
Yup, seems like a bug. It seems like it is not finding any valid solution at all when you set length < t.
Is there anyway you could avoid using lengths on alignment? The Alignment code is a bit gnarly, so I am guessing I just messed up the length constraint. An alternative method is just to pad your sequences with matches on the boundaries.
Hello Dr. Rush,
Many thanks for your rapid response and diagnosis of the problem. My data is intrinsically variable length, so I appreciate your help thinking through alternatives to lengths
while the bug stands.
If I understand your suggestion, you are recommending I create fake log_potentials
by padding both my ground truth sequences and predicted logits with match values? How do I determine what the match value should be? (in your example you use the predicted logit at the true label index). Should I just use 0 so that exp(0) = 1 (certain match). How would this effect dist.argmax
? And is it the case that if I follow this procedure exactly my log_marginal probability should not change (intuitively, it seems like this should change the partition function and therefore my marginal)
Thank you again! Tim Y.
Dear Dr. Rush,
First, thank you for writing the excellent arxiv explainer on this library! It very much helped me understanding. I have experimented with padding with matches, but I am not confident in my choice of log potentials as I still do not fully understand how this will effect the partition function and argmax. I wonder if you might confirm that 0 is the proper choice of log potential for certain match? Or perhaps it would be possible to write a wrapper which does this padding automatically for a given length constraint.
Either way, thank you and hope to hear from you soon. Best, Tim
Hi Tim,
Sorry for the slow reply. I started looking at the issue, but I am still not happy with the speed of this particular function and am worried it might not useful. Are you using it in a production system or a demo?
Either way I will try to get a padded demo this week, but I want to make sure you are not waiting on a function that will not be directly applicable.
Hi Dr. Rush, Thanks so much for the info. My project is a research prototype so a big performance hit should be just fine if I can get something running at all.
I look forward to the padding demo, and once again very grateful for your help! Tim
Hi Dr. Rush and struct team,
Hope all is well with the virus and what not. I know it must be very hectic for you, but I wanted to check in on this. I'm now basically stuck at home and have a lot of time for this project and was hoping I could get a prototype of the alignment working.
Let me know if you have any updates or could point me in the right direction. Thank you! Tim
I'm so sorry, things have been a bit crazy. I will try to get to this this week.
Hi, I finally got around to making you a full example:
https://github.com/harvardnlp/pytorch-struct/blob/master/notebooks/CTC_with_padding.ipynb
Let me know if it is clear. I will eventually get his checked in under the lengths command.
This looks like exactly what I needed! Thank you SO much Dr. Rush, best wishes and stay safe
Awesome. I am going to leave it open so I remember to build it in.
Hello! First thanks for this awesome project. I think it will make a big difference in the NLP community and I am already excited to use it in my work.
Problem TLDR: AlignmentCRF.log_prob returns values greater than 0 (marginal probabilities greater than 1. I think this might be a bug unless I'm misusing the API.
More background
I'm attempting to use an Alignment CRF (smith waterman) as the loss function for a generative model. The "true" labels are sequences of tokens of various length, and I like the alignment because it will still work if my model accidentally adds an extra token. Obviously my data is variable length, so I provide the lengths tensor argument which has the length of my true labels in each batch. When I do this and compute
dist.log_prob(dist.argmax)
I get some batches who have log likelihood which is very positive. I checked the partition function and it looks very negative, so my guess is the error might have to do with masking in the partition but not the score or vice-versa.To reproduce this error
I get the same are in the CTC.ipynb example in the notebooks folder when I initialize the distribution with
dist = torch_struct.AlignmentCRF(log_potentials, lengths=torch.Tensor([t] * 2 + [t-1] * 3).long())
and thendist.log_prob(dist.argmax)
. Image of the error is below. Here is a collab notebook which has the error https://colab.research.google.com/drive/1C1uDWNe8IcXB-Re6WcnwRsif-q-JQySe.Do you agree this is a bug or is this the intended behavior? Any suggestions for how I might debug this or correctly use your API would be much appreciated. I am not familiar with SemiRings but if you explain what might fix the issue I can try to submit a pull request (still fairly new to pytorch so I'm not sure how much help I can be).
Thanks for your attention and again for the great library! Tim Y.