Open min-xu-ai opened 3 years ago
@min-xu-ai Is there an action item to follow up here? Which of the above listed issues still occur and what is the priority to fix them?
I think the backward firing cases have improved a lot since then. @zhaojuanmao
Different wrapping orders may still have issues, since we don't test all the combinations exhaustively.
@min-xu-ai Does it make sense to take any of the above issues or should we figure out the larger issue behind the behaviors mentioned above? It seems like checkpoint_wrapper may not have the most consistent behavior with FSDP.
Unit tests are being added here: https://github.com/facebookresearch/fairscale/pull/476
but I don't have a big picture of what needs to be fixed yet. Some observations:
cc: @prigoyal @myleott
I am going to document a list of the issues we have found so far for tracking purposes.