CUNY-CL / yoyodyne

Small-vocabulary sequence-to-sequence generation with optional feature conditioning
Apache License 2.0
25 stars 15 forks source link

Pad tensor after eos #201

Closed Adamits closed 1 week ago

Adamits commented 1 week ago

In #184 I seem to have forgotten to test predict.py. It attempts to get a handle on the old evaluator in order to finalize_predictions, but now we have multiple evaluators and don't know which one to use.

Instead, this generalizes the notion of replacing everything after EOS in a tensor with a PAD and puts it in util.py. We can now use that for prediction in case the model predicts random chars after EOS. I also made minor clarifications to evaluators.py, where we sometimes finalize_* without replacing with PADs, as they are unnecessary if we aren't keeping a tensor as in the SER case.

I will need to test the evaluators beforemerging, but was hoping to get eyes on this for design or naming issues.

Adamits commented 1 week ago

I like the idea of differentiating padding and EOS, since we have both of them in the vocabulary. (I can't imagine it would hurt and it might help.) And since that's a scary function that occurs in a few places I like it living in util. No objections here. LGTM. Let me know when you're ready for me to merge it.

Sounds good! Going to run a test with all evals and then should be good.

Adamits commented 1 week ago

Ok ready to merge if there are no other feedback!

bonham79 commented 1 week ago

@kylebgorman mind holding off merge until after 12pm your time? (Finishing up laundry)

Adamits commented 1 week ago

@kylebgorman mind holding off merge until after 12pm your time? (Finishing up laundry)

@bonham79 Are you not in NY? What area are you in these days

bonham79 commented 1 week ago

@kylebgorman mind holding off merge until after 12pm your time? (Finishing up laundry)

@bonham79 Are you not in NY? What area are you in these days

Oh I'm In SF these days. That's why I'm a tad later on commenting. (That and day job.)

Did you check predict function on the edit action and hmm transducers? I've noticed time to time the evaluator changes don't percolate to those models.

Adamits commented 1 week ago

@kylebgorman mind holding off merge until after 12pm your time? (Finishing up laundry)

@bonham79 Are you not in NY? What area are you in these days

Oh I'm In SF these days. That's why I'm a tad later on commenting. (That and day job.)

Oh nice im just south of SF for the summer, we should get a coffee or something!

Did you check predict function on the edit action and hmm transducers? I've noticed time to time the evaluator changes don't percolate to those models.

I will check. I think it should work after https://github.com/CUNY-CL/yoyodyne/pull/200 though.

bonham79 commented 1 week ago

Oh nice im just south of SF for the summer, we should get a coffee or something!

San Jose South or Santa Cruz south?

Adamits commented 1 week ago

Oh nice im just south of SF for the summer, we should get a coffee or something!

San Jose South or Santa Cruz south?

Apt is in San Carlos, so not very south of SF.

bonham79 commented 1 week ago

Oh nice im just south of SF for the summer, we should get a coffee or something!

San Jose South or Santa Cruz south?

Apt is in San Carlos, so not very south of SF.

oh yeah that's my commute. yeah we getting coffee. I'll email you off thread.

Adamits commented 1 week ago

Ok I do get an error when running the predict script for Transducer but it seems unrelated. I have been calling predict with

--target_col 0 in order to make sure we do not do teacher forcing or similar. For Transducer, this causes:

File "/Users/adamwiemerslage/nlp-projects/morphology/yoyodyne/yoyodyne/models/transducer.py", line 548, in predict_step
    predictions, _ = self.forward(
  File "/Users/adamwiemerslage/nlp-projects/morphology/yoyodyne/yoyodyne/models/transducer.py", line 124, in forward
    target_mask=batch.target.mask,
AttributeError: 'NoneType' object has no attribute 'mask'

We should not actually need the target col at predict time right?

bonham79 commented 1 week ago

We should not actually need the target col at predict time right?

No we should not. Okay, ignore edit action transducer and I'll update once your merge is done. (No point tacking on unrelated changes.)