epfl-cs358 / 2024sp-helping-hand


Research the semantic segmentation from facebook #43

Closed malenraychev closed 1 month ago

malenraychev commented 2 months ago
gruvw commented 2 months ago

@nourguermazi @violoncelloCH

violoncelloCH commented 1 month ago

I've started some work here: https://github.com/epfl-cs358/2024sp-helping-hand/pull/46

However, we need a general discussion about whether this approach is feasible within the scope of our project. Facebook's SAM (Segment Anything Model) consumes a lot of RAM and CPU while running: even the smallest of the available models takes more than 10 GB of RAM during image analysis, and analyzing a single picture takes multiple minutes on a CPU. Since this is meant to run on an individual's computer, most likely without a beefy GPU, I don't think it makes sense to pursue this approach, at least not with this particular heavy model. It makes no sense to wait minutes for a process to complete when doing the same configuration by hand takes less time and guarantees a correct result, while the automated analysis might still need manual adjustment.
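If anyone wants to reproduce these numbers on their own machine, here is a rough sketch of how the time and memory could be measured. It is not the code from the PR: `measure` and `fake_analysis` are hypothetical names, and `fake_analysis` is only a stand-in for the actual segmentation call. It assumes a Unix-like system, since it relies on the standard `resource` module.

```python
import resource
import sys
import time


def measure(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds, peak_rss_mib).

    Note: the peak RSS is the high-water mark of the whole process so far,
    not only of this call, so run the measurement in a fresh process.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return result, elapsed, peak / divisor


def fake_analysis(n):
    # Hypothetical stand-in for the real image analysis:
    # allocates a zero-filled buffer of n bytes and sums it.
    data = bytearray(n)
    return sum(data)


if __name__ == "__main__":
    _, seconds, peak_mib = measure(fake_analysis, 10_000_000)
    print(f"elapsed: {seconds:.2f} s, peak RSS: {peak_mib:.0f} MiB")
```

Replacing `fake_analysis` with the real analysis of one picture would give a number to compare against the time it takes to configure the same thing by hand.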