keras-team / keras-hub

Modular Natural Language Processing workflows with Keras
Apache License 2.0
763 stars, 228 forks

StartEndPacker left padding #1781

Open: Leo-Lifeblood opened 1 month ago

Leo-Lifeblood commented 1 month ago

Is your feature request related to a problem? Please describe.

I'm working on a causal regression model, and not having a left-padding option is making my life more difficult than it has to be.

Describe the solution you'd like

I'd like a parameter added to StartEndPacker that lets me switch the padding side from right to left.

divyashreepathihalli commented 2 weeks ago

@Leo-Lifeblood Thanks for filing an issue. In general, causal models typically do not require left padding. This is because causal models focus on understanding the effects of past events on future outcomes. Padding on the left, which adds information or placeholders before the actual sequence, could potentially introduce artificial dependencies or distort the temporal relationships that the model is trying to learn.

Right padding, on the other hand, is often used in causal models to ensure that all sequences in a batch have the same length, which is necessary for efficient processing by many deep learning frameworks. Right padding adds information or placeholders after the actual sequence, which doesn't interfere with the causal relationships the model is trying to capture.

So, we do not think it is a good idea to add this feature. However, if your use-case demands it, please feel free to subclass the layer and modify it.
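For readers following the thread, here is a minimal sketch of the behaviour being requested, written in plain Python so it runs without the library. It is not keras-hub code and not a subclass of StartEndPacker: the function name `pack_sequence` and the `padding_side` parameter are hypothetical, and the truncation rule (trim the body so the start and end markers always fit) is an assumption rather than a statement of what the library does.

```python
def pack_sequence(tokens, sequence_length, start_value=None, end_value=None,
                  pad_value=0, padding_side="right"):
    """Add optional start/end markers, truncate, then pad on the chosen side.

    Hypothetical helper for illustration only; not part of keras-hub.
    """
    if padding_side not in ("left", "right"):
        raise ValueError("padding_side must be 'left' or 'right'")

    # Count how many marker tokens will be added around the body.
    n_special = (start_value is not None) + (end_value is not None)

    # Assumption: truncate the body so that the markers always fit.
    body = list(tokens)[: max(sequence_length - n_special, 0)]

    packed = (
        ([start_value] if start_value is not None else [])
        + body
        + ([end_value] if end_value is not None else [])
    )
    # Guard for very small sequence_length values.
    packed = packed[:sequence_length]

    padding = [pad_value] * (sequence_length - len(packed))
    # The only difference between the two modes is where the padding goes.
    return padding + packed if padding_side == "left" else packed + padding


right = pack_sequence([5, 6, 7], 6, start_value=1, end_value=2)
left = pack_sequence([5, 6, 7], 6, start_value=1, end_value=2,
                     padding_side="left")
print(right)
print(left)
```

With left padding the real tokens end at the last position of every row, which is the property a user typically wants when the final time step feeds a prediction head. Any padding mask would have to be built to match the chosen side.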

Leo-Lifeblood commented 2 weeks ago

I mean, sure, in most cases it is a bad idea. However, that goes for most things in life: for instance, it's generally not a good idea to drive on the left side of the road unless you are in the UK, in which case it's a very good idea. I am working with sections of text where I must use left padding for highly context-specific reasons, and in just such cases having that option and others like it makes the library substantially more usable. Adding it as an option does not seem very difficult and would add a lot to the flexibility of the system in my eyes. I don't see why you need guardrails for the community of Keras users that prevent them from choosing to use left padding. If you understand enough to see that choice, I'm sure you also understand enough to make a sensible decision when faced with it.