captainproton1971 opened this issue 4 years ago
Hi, just wondering if anyone has started to look at this?
Hi @captainproton1971. Unfortunately I haven't had the time to start on this and don't personally expect to for a while.
@liyunlu0618
Unassigning myself, since an issue should only be assigned when there is active work on it.
Hi @captainproton1971, sorry for the really late response.
Just want to check whether this still affects you before we start triage.
Hi, thanks @teijeong. I'll check this weekend to see if it's still causing problems and post a reply.
Describe the bug Training a sparse model that includes a Masking layer followed by an LSTM, and then stripping the model with strip_pruning(), produces models that handle the masking differently. This means models trained with pruning are not usable for inference after stripping.
System information
Python version:
Describe the expected behavior A model with a prune_low_magnitude LSTM layer should generate the same output as its stripped counterpart.
Describe the current behavior A model in which a prune_low_magnitude LSTM layer is fed by a Masking layer appears to handle the masking differently from its stripped version. Moreover, the difference is not simply that one of them ignores the sequence mask.
Code to reproduce the issue
Check behaviour of outputs
Including the masking
A quick inspection shows that x1_a, x1_b, and x2_a are very different from each other. I've also confirmed that the layer configurations are the same in the two models.
Additional context I didn't see anything in the documentation about behaviour with masked inputs to pruned LSTM layers, but the current behaviour (handling the mask differently from both the usual Keras behaviour and from ignoring the mask completely) seems counter-intuitive.
I found this problem after training and pruning a masked LSTM model, stripping it, and finding that its output was not consistent with that of the pruned model.
Thank you in advance for any help you can offer.