Closed jeffmaury closed 1 month ago
@jeffmaury @slemeur @axel7083 I have a couple questions to help me understand the issue better:
1. Is there any documentation for this UI and for working with model services in AI Lab?
No, only the README in the repo: https://github.com/containers/podman-desktop-extension-ai-lab?tab=readme-ov-file#usage
2. How are these two different? Which one is the Endpoint URL to be copied? ![Screenshot 2024-09-12 at 10 23 13](https://private-user-images.githubusercontent.com/14108590/366785773-c786f8a4-837b-43e6-be16-3bbb7ffc4b87.png)
The first one is the URL of the OpenAPI spec for the inference server; the second one is the base URL of the OpenAI-like API to be configured in applications.
3. What is the user's main goal on this page and what have they come to do? If it's the URL they want, what would they do with it next?
Integrate it into their applications, hence the code snippets.
4. Where is the Inference API documentation now?
On the OpenAI website: https://platform.openai.com/docs/api-reference/chat
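Since the server exposes an OpenAI-compatible chat API, the request shape from the linked reference can be sketched roughly like this (an illustration only; the port, model name, and helper are placeholders, not taken from this thread):

```python
import json
import urllib.request

# Hypothetical local endpoint; <port> would come from the running container.
BASE_URL = "http://localhost:35000/v1"

def build_chat_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for a local server."""
    payload = {
        "model": "local-model",  # placeholder; the local server serves one model
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request(BASE_URL, "Hello!")
# Sending it requires a running inference server, e.g.:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```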
I have highlighted the issues at hand on this screenshot, so I will address them when redesigning the UI:
I have taken a first stab at the UI. Please review; I'm pretty sure I missed some things or potentially mislabeled something:
Models should be Model. I don't understand the endpoint that is related to GPU inference; the GPU inference label seems to duplicate the GPU inference block.
Having a dedicated section for GPU seems useless to me, as we cannot provide insight into GPU usage. There is no difference in the API between a GPU-accelerated container and a CPU one, only the speed.
We can measure the CPU a container is using, but not the GPU.
Therefore I don't see the point of having a GPU section. Moreover, we cannot estimate the VRAM usage of a model on the GPU; the RAM estimation is for memory (RAM) only.
For the model, the name can be clickable to redirect to the model page, but CPU or GPU inference is not related to the model, it is related to the hardware: all models can run on GPU if you have the right hardware (e.g. a nice Nvidia card).
Thank you for the feedback! The ask to have GPU in a separate section comes from the original ticket: "Finally, along with https://github.com/containers/podman-desktop-extension-ai-lab/issues/495, we could put the 'GPU Inference' onto a dedicated section."
However, if it's not needed, it doesn't have to be. I'll update the mock.
@axel7083 @jeffmaury please review:
Not sure what the second URL is?
@jeffmaury I just copied whatever is in the UI right now. Do you feel like something shouldn't be there? Attaching a screenshot:
There are two URLs for the inference server (llama-cpp-python):
`http://localhost:<port>/v1`
This is the API link: we want the user to copy it and use it for programmatic calls. It is the link used in the code snippets.
`http://localhost:<port>/docs`
This one is specific to the llama-cpp-python inference server. It is a link to the API documentation and should be clickable, so the user can open it in the browser and navigate the documentation.
⚠️ This is not a link that will exist for every inference server (e.g. whisper-cpp does not have a documentation link).
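To make the distinction concrete, the two URLs above can be derived from the container's published port roughly like this (a sketch; the helper name and port are illustrative, and /docs applies only to llama-cpp-python):

```python
def inference_urls(port: int) -> dict[str, str]:
    """Derive the two URLs the UI shows for a llama-cpp-python server."""
    base = f"http://localhost:{port}"
    return {
        # Base URL for OpenAI-style clients; this is what the code snippets use.
        "api": f"{base}/v1",
        # OpenAPI/Swagger docs page; llama-cpp-python specific, shown as a link.
        "docs": f"{base}/docs",
    }

urls = inference_urls(8085)  # 8085 is an example port
```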
We seem to be moving the Inference Card above the container section. I have mixed feelings about it: to keep consistency between the screens of Podman Desktop and AI Lab, I believe we should keep it in the header.
I understand the inference details are important, but this is still a container, and that information should stay at the top (IMO).
@axel7083 that's a great explanation, thank you! Is there a reason why `http://localhost:<port>/docs` is shown the way it is? Would it be okay if it looked like a regular link? Is "swagger documentation" a term generally familiar to developers?
@jeffmaury would this ^^ be a different, local documentation link, separate from the OpenAI API documentation at https://platform.openai.com/docs/api-reference/chat? And is it the OpenAI link that is missing now?
I think it should be OpenAPI rather than swagger.
I don't think we want to link the OpenAI documentation website.
Why not? What would be the advantages/disadvantages of one over the other? Or maybe let's pose the question this way: what is it that users will gain from documentation on this page?
I would suppose they would learn more about inference endpoints and how to use them in their apps. A good example here is what Hugging Face does on their server pages: out of all the model catalogs and registries that UX has evaluated, they provide the most structured, relevant and helpful information.
Because we don't want to advertise a closed product, even if it's close to a standard, and only a few developers would be interested in it.
I really like how they display the information.
Let's take a look at the tabs:
Logs
I have some doubts about the Logs tab. It could be feasible to have, but we already have the logs in the container details, which are visible by clicking through to the container.
Analysis
The Analysis tab would not make much sense, as we do not have much to put in it. We could instead have a Performance tab, where we could display the memory/CPU utilisation.
Settings
We do not support changing the configuration after the container is started (sadly), but we could (maybe) have a read-only page.
Ok, so you agree with @axel7083 that it should be swagger? Okay! :) I'll just label it better then: "Inference API documentation".
Yes, I am not saying we should copy them, tabs included; I was merely showing the way they treat the Endpoint URL there for some inspiration ^^
I like the tabs, I would not mind having them.
@ekidneyrh and I looked at the UI and we came up with another UI suggestion:
Here:
Why are we moving the Container section out of the header? I don't see the point.
Having it at the top makes it consistent with other parts of the application.
It was not a decision that I made; it was just an artefact of the Penpot file I was reusing from previous designs. I'll update it to match the current state.
@axel7083 ^^
Great!
@jeffmaury @slemeur @axel7083 Are you happy with this design suggestion?
LGTM, but I think the Inference Server block takes too much space vertically. Is there a way to shrink it a little bit? What is the More Info link?
The most important information on this page is the endpoint information: it needs to be moved to the top. There is no reason a user would start an inference server otherwise.
On the UI, we have boxes within boxes, within another set of boxes. Would it be possible to make things a bit simpler?
There are a couple of inconsistencies:
@slemeur I tried to address most of your concerns in this new mock:
Here's the Penpot link for comments: penpot
I would also be happy to meet and talk through individual elements.
A couple of comments from the UX sync call: the model name should be clickable; quantisation info is currently not available.
Breadcrumbs missing, but nice mock-ups, love it!
Closed, implementation is being tracked by #1592.
Is your enhancement related to a problem? Please describe
See
Describe the solution you'd like
A mockup for the redesigned UI
Describe alternatives you've considered
No response
Additional context
No response