hkchengrex / XMem

[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
https://hkchengrex.com/XMem/
MIT License
1.72k stars 191 forks source link

Added support for MPS Apple Silicon and quality of life improvements to the GUI #103

Closed jjhickmon closed 1 year ago

jjhickmon commented 1 year ago

With increased MPS support from PyTorch, this feature will be necessary for users to run XMem on-device. Provided is a brief summary of the changes:

A few remaining issues include:

hkchengrex commented 1 year ago

Hi @jjhickmon, thanks a lot for the contribution. I don't have time to look at it right now -- feel free to ping me if I didn't reply within a week.

jjhickmon commented 1 year ago

Hi @hkchengrex thanks for the comments! I can raise the revision after you address the comment I left about torch.argmax. Along with this, while you were reviewing the PR I was able to fix the issue with the original shortcuts not working on Mac. Would you prefer I update this PR or raise a new one after?

I also saw undo was partially implemented and I have a working implementation locally, but would you prefer the published code to not have the undo functionality?

hkchengrex commented 1 year ago

Hi,

I have made some minor changes to the aesthetic (PyQt6 does not expand the draw area properly). I have also changed the video output format to mp4 and coded some backward compatibility for older PyTorch without MPS. I changed the "mask LCD" to "object dial", and changed it from a text box to a spin box for better input validation. Can you test that it still works on Mac?

Feel free to push any more fixes to the shortcuts. As for the "undo" function, I left it out because it was too time-consuming to make bug-free. Pull requests for this function are definitely welcomed. I would expect a working undo function to go into each of the interaction types (e.g., we can undo click/scribble/free drawing) and restore states perfectly (i.e., if the user reset the mask with a click interaction and regretted it, they can always go back). The MiVOS version does not do this perfectly which is why I abandoned it here. That would be quite a lot of code changes orthogonal to the current one so it should be a different pull request.

hkchengrex commented 1 year ago

I'm merging because this adds good functionalities with no apparent bugs. Thank you for your contribution.