abrichr opened this issue 11 months ago
@FFFiend thoughts? 🙏 😄
I took a look at the `client.py` code as well as the docs, so from my understanding:
Instead of using a Gradio app URL or Hugging Face Space, we would like to create an entry point to the EC2 SoM instance we have available, and return a marked screenshot as the output of `predict`.
Bit confused on how a marked screenshot is defined though 😄
@FFFiend thanks for your quick response!
> Instead of using a gradio app url or HuggingFace space, we would like to create an entry point to the EC2 SoM instance we have available and return a marked screenshot as the output of predict
Exactly right! We would need to integrate `deploy.py` and a variation of `client.py`, which would both be called from elsewhere in OpenAdapt (e.g. `visualize.py`, `replay.py`).
> Bit confused on how a marked screenshot is defined though 😄
No worries! You can see the marked screenshot in the PR description, reproduced here:
The original screenshot is on the left, the marked screenshot is on the right.
awesome, so for `client.py` I'm envisioning the client to work as follows:

- Use `start` and `stop` from the `Deploy` class for instantiating and then closing the instance.
- Use either paramiko (https://www.paramiko.org/) or pexpect (https://pexpect.readthedocs.io/en/stable/) to have a runner for functions within the instance.
- Write up or reuse SoM logic from one of the existing demos (`demo_som.py` for example) into a function and then plug that into the runner above, and inference is done.

The original repo doesn't have any architecture or heavy ML code laid out, so I'm guessing the meat n potatoes is within the demo files, but I could be wrong.
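The start/stop lifecycle in the first step could be wrapped in a context manager so the instance is always shut down, even if inference fails. This is only a sketch: the `Deploy` class here is a stub standing in for the real one in `deploy.py`, whose actual interface may differ.

```python
from contextlib import contextmanager


class Deploy:
    """Stub standing in for the real Deploy class in deploy.py,
    which would manage the EC2 SoM instance."""

    def __init__(self) -> None:
        self.running = False

    def start(self) -> str:
        # Real implementation: launch the EC2 instance and return
        # the URL where the SoM app is served (URL is illustrative).
        self.running = True
        return "http://som-instance.example:6092"

    def stop(self) -> None:
        # Real implementation: terminate the EC2 instance.
        self.running = False


@contextmanager
def som_instance():
    """Start the instance, yield its URL, and guarantee cleanup."""
    deploy = Deploy()
    url = deploy.start()
    try:
        yield url
    finally:
        deploy.stop()
```

Callers would then do `with som_instance() as url: ...` and run inference against `url`, without having to remember to call `stop` themselves.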
@FFFiend thanks for your patience! Just saw this 😅
> - Use start and stop from the Deploy class for instantiating and then closing the instance.
Agreed.
> - Use either paramiko https://www.paramiko.org/ or https://pexpect.readthedocs.io/en/stable/ to have a runner for functions within the instance.
This may be unnecessary. https://github.com/microsoft/SoM includes a `client.py` which uses the Gradio API -- our `client.py` should look similar.
> - Write up or reuse SOM logic from one of the existing demos (demo_som.py for example) into a function and then plug that into the runner above, and inference is done.
Bingo! This should all go in `client.py`.
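As a rough illustration of the Gradio-based approach mentioned above, the client could use the `gradio_client` package to call the remote app. The `api_name` and server URL here are assumptions for illustration, not values taken from the actual repo; check what the deployed app exposes.

```python
def mark_screenshot(server_url: str, image_path: str) -> str:
    """Send a screenshot to a remote SoM Gradio app and return the
    marked screenshot it produces.

    server_url would point at the EC2 instance rather than a
    Hugging Face Space; api_name below is a placeholder.
    """
    # Deferred import so the sketch reads standalone
    # (pip install gradio_client).
    from gradio_client import Client

    client = Client(server_url)
    # predict() blocks until inference completes and returns the
    # output declared by the remote Gradio interface.
    return client.predict(image_path, api_name="/predict")
```

Usage would be something like `mark_screenshot("http://<ec2-host>:6092", "screenshot.png")`, with the host filled in from whatever `deploy.py` reports.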
Feature request
To support https://github.com/openadaptai/SoM we need to implement a `client.py` with https://www.gradio.app/docs/client.
Motivation
https://github.com/openadaptai/SoM is state-of-the-art for visual understanding, but it only runs on Linux / CUDA.
Refer to system diagram:
Inference (SoM/SAM) must be done remotely.
We wish to implement:
- `openadapt/adapters/som/client.py`: a modified version of the `client.py` in https://github.com/microsoft/SoM/pull/19, to support getting marked screenshots during analysis (visualization) and replay
- `openadapt/adapters/som/server`, which can be a git submodule containing https://github.com/OpenAdaptAI/SoM/
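If the submodule route is taken, the setup would presumably look like this (the path is taken from the proposal above):

```shell
# Register the fork as a submodule at the proposed server path
git submodule add https://github.com/OpenAdaptAI/SoM/ openadapt/adapters/som/server

# Contributors cloning the parent repo then populate it with:
git submodule update --init --recursive
```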