lf1-io / padl

Functional deep learning
Apache License 2.0
106 stars 4 forks source link

Issue for reimplementation (Debugger not tracking the error properly when trivial Identities) #340

Open eramdiaz opened 2 years ago

eramdiaz commented 2 years ago

🐞 Bug

When we get an exception on a transform that has useless Identity()s, the debugger is not able to track correctly which element of the entire transform is getting the Exception.

Example: Let t be the following Transform

t = prep_1 >> Identity() >> Identity() >> batch >> Identity() >> emb >> linear

and suppose we are getting an Exception on linear. pd_splits does not take into account the useless Identities, so t.pd_preprocessis

prep_1 >> batch

and t.pd_forward is

emb >> linear

so when we catch the Exception on t.pd_forward, the item we get failing is 1, and when we are trying to track this index back in the entire transform we get:

len(t.pd_preprocess)` + failing_item = 2 + 1 = 3

so we would get that the transform that is faling is batch, which is wrong.

For solving this, we would need to count how many useless Identities are before the index failing, and I have not found a solution with the information we've got (the index of the exception on the stage that fails).

sjrl commented 2 years ago

@wuhu Would one solution to this be to never remove the useless Identities?

wuhu commented 2 years ago

@sjrl I think so. We would end up with a lot of identities in the splits though (this is why we remove them), possibly we could somehow not remove them when they were in there in the first place (in the original transform). Anyhow, I think this is a rather edgy edgecase and I think we'll probably be able to solve this easier once we move to the graph model. Therefore, I suggest leaving it as it is for now.

eramdiaz commented 2 years ago

This issue was tackled on the pr https://github.com/lf1-io/padl/pull/405, where the first arrow of the debugger was removed due to this issue (https://github.com/lf1-io/padl/issues/340) and on https://github.com/lf1-io/padl/issues/404. At the moment we'll keep it open to take into account on a future implementation of this arrow.