containers / podman-desktop-extension-ai-lab

Work with LLMs on a local environment using containers
https://podman-desktop.io/extensions/ai-lab
Apache License 2.0
180 stars 35 forks

Redesigned inference server details mockup #1691

Closed jeffmaury closed 1 month ago

jeffmaury commented 1 month ago

Is your enhancement related to a problem? Please describe

See

Describe the solution you'd like

A mockup for the redesigned UI

Describe alternatives you've considered

No response

Additional context

No response

MariaLeonova commented 1 month ago

@jeffmaury @slemeur @axel7083 I have a couple questions to help me understand the issue better:

  1. is there any documentation for this UI and working with model services in AI Lab?
  2. how are these two different? Which is the Endpoint URL to be copied? Screenshot 2024-09-12 at 10 23 13
  3. what is the user's main goal on this page and what have they come to do? If it's the URL they want, what would they do with it next?
  4. where is the Inference API documentation now?

I have highlighted the issues at hand on this screenshot, so I will address them when redesigning the UI:

Screenshot 2024-09-12 at 10 26 19

jeffmaury commented 1 month ago

> @jeffmaury @slemeur @axel7083 I have a couple questions to help me understand the issue better:

> 1. is there any documentation for this UI and working with model services in AI Lab?

No, only the README from the repo: https://github.com/containers/podman-desktop-extension-ai-lab?tab=readme-ov-file#usage

> 2. how are these two different? Which is the Endpoint URL to be copied? Screenshot 2024-09-12 at 10 23 13

The first one is the URL of the OpenAPI spec for the inference server; the second one is the base URL of the OpenAI-like API to be configured in applications.

> 3. what is the user's main goal on this page and what have they come to do? If it's the URL they want, what would they do with it next?

Integrate it into their applications, hence the snippets.

> 4. where is the Inference API documentation now?

On the OpenAI website: https://platform.openai.com/docs/api-reference/chat

> I have highlighted the issues at hand on this screenshot, so I will address them when redesigning the UI:
>
> Screenshot 2024-09-12 at 10 26 19
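Since the second URL is the base of an OpenAI-compatible API, a minimal programmatic call could look like the sketch below. This is illustrative only: the port `35000` and the `"local-model"` value are placeholders, and the helper names are not from AI Lab's codebase.

```python
import json
import urllib.request

def chat_completions_url(base_url: str) -> str:
    """Derive the chat endpoint from the base URL copied out of the UI."""
    return base_url.rstrip("/") + "/chat/completions"

def build_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build a POST request following the OpenAI chat-completions shape."""
    payload = {
        # Placeholder; llama-cpp-python serves whichever model it loaded.
        "model": "local-model",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        chat_completions_url(base_url),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Requires a running inference server; the port is a placeholder.
    req = build_request("http://localhost:35000/v1", "Say hello")
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
        print(body["choices"][0]["message"]["content"])
```

The code snippets shown in the AI Lab UI play the same role: they take the copied base URL and turn it into a working client call.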

MariaLeonova commented 1 month ago

I have taken the first stab at the UI - please review, I'm pretty sure I missed some things or potentially mislabeled:

Screenshot 2024-09-13 at 15 20 28

penpot

jeffmaury commented 1 month ago

> I have taken the first stab at the UI - please review, I'm pretty sure I missed some things or potentially mislabeled: Screenshot 2024-09-13 at 15 20 28
>
> penpot

Models should be Model. I don't understand the endpoint that is related to GPU inference. The label GPU inference seems a duplicate of the GPU inference block.

axel7083 commented 1 month ago

image

Having a dedicated section for GPU seems useless to me, as we cannot provide insight into GPU usage. There is no difference in the API between a GPU-accelerated container and a CPU one, only the speed.

We can measure the CPU a container is using, but not the GPU.

Therefore I don't see the point in having a GPU section. Moreover, we cannot estimate the VRAM usage of a model on the GPU; the RAM estimation is for memory (RAM) only.

image

For the model, the name can be clickable to redirect to the model page, but CPU or GPU inference is not related to the model but to the hardware; all models can run on GPU if you have the right hardware (e.g. a nice Nvidia card).

MariaLeonova commented 1 month ago

Thank you for the feedback! The ask to have GPU in a separate section comes from the original ticket:

> Finally, along with https://github.com/containers/podman-desktop-extension-ai-lab/issues/495, we could put the "GPU Inference" onto a dedicated section.

However, if it's not needed, it doesn't have to be. I'll update the mock.

MariaLeonova commented 1 month ago

@axel7083 @jeffmaury please review:

Server Details - 17 _ 9

jeffmaury commented 1 month ago

> @axel7083 @jeffmaury please review:
>
> Server Details - 17 _ 9

Not sure what the second URL is?

MariaLeonova commented 1 month ago

@jeffmaury I just copied whatever is in the UI right now - do you feel like something shouldn't be there? Attaching a screenshot: Screenshot 2024-09-17 at 9 14 39

axel7083 commented 1 month ago

URLs

There are two URLs for the inference server (llama-cpp-python):

http://localhost:<port>/v1

This is the API link: we want users to copy it and use it for programmatic calls.

It is the link used in the code snippets

http://localhost:<port>/docs

This is specific for llama-cpp-python inference server.

This is a link to the API documentation. It should be clickable, so the user can open it in the browser and navigate the documentation.

⚠️ This is not a link that would be there for every inference server (e.g. whisper-cpp does not have a documentation link)
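To make that concrete, both URLs can be derived from the same server root, and a UI could probe `/docs` to decide whether to show the documentation link at all. A rough sketch under those assumptions (none of these helpers exist in AI Lab's codebase):

```python
import urllib.error
import urllib.request

def api_base_url(root: str) -> str:
    """The /v1 base URL users copy into client code (used by the snippets)."""
    return root.rstrip("/") + "/v1"

def docs_url(root: str) -> str:
    """The interactive docs page; llama-cpp-python serves it, others may not."""
    return root.rstrip("/") + "/docs"

def has_docs_page(root: str, timeout: float = 2.0) -> bool:
    """Probe /docs so the link is only shown for servers that provide it
    (e.g. whisper-cpp would return False here)."""
    try:
        req = urllib.request.Request(docs_url(root), method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```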

Order

We seem to be moving the Inference card above the container section. I have mixed feelings about it: to keep consistency between the screens of Podman Desktop and AI Lab, I believe we should keep it in the header.

I understand the inference details are important, but this is still a container, and that information should be at the top (IMO).

MariaLeonova commented 1 month ago

@axel7083 that's a great explanation, thank you! Is there a reason why http://localhost:<port>/docs is shown the way it is? Would it be okay if it looked like a regular link? Is "Swagger documentation" a term generally familiar to developers?

Screenshot 2024-09-17 at 14 19 21

@jeffmaury would this ^^ local documentation link be different from the OpenAI API documentation https://platform.openai.com/docs/api-reference/chat then? And is it the OpenAI link that is missing now?

jeffmaury commented 1 month ago

> @axel7083 that's a great explanation, thank you! Is there a reason why http://localhost:<port>/docs is shown the way it is? Would it be okay if it looked like a regular link? Is "Swagger documentation" a term generally familiar to developers?
>
> Screenshot 2024-09-17 at 14 19 21
>
> @jeffmaury would this ^^ local documentation link be different from the OpenAI API documentation https://platform.openai.com/docs/api-reference/chat then? And is it the OpenAI link that is missing now?

I think it should be OpenAPI rather than Swagger.

axel7083 commented 1 month ago

I don't think we want to link the openai documentation website

MariaLeonova commented 1 month ago

> I don't think we want to link the openai documentation website

Why not? What would be the advantages / disadvantages of one over the other? Or maybe let's pose the question this way: what is it that the users will gain from documentation on this page?

I would suppose they would learn more about inference endpoints and how to use them in their apps. A good example here is what Hugging Face does on their server pages - out of all the model catalogs and registries that UX has evaluated, they provide the most structured, relevant and helpful information.

image

jeffmaury commented 1 month ago

> I don't think we want to link the openai documentation website
>
> Why not? What would be the advantages / disadvantages of one over the other? Or maybe let's pose the question this way: what is it that the users will gain from documentation on this page?
>
> I would suppose they would learn more about inference endpoints and how to use them in their apps. A good example here is what Hugging Face does on their server pages - out of all the model catalogs and registries that UX has evaluated, they provide the most structured, relevant and helpful information.
>
> image

Because we don't want to advertise a closed product, even if it's close to a standard, and only a few developers would be interested in it.

axel7083 commented 1 month ago

image

I really like how they display the information.

Let's take a look at the tabs

Logs

I have some doubts about the logs tab. It would be feasible to have, but we already have the logs in the container details, which can be opened by clicking on Screenshot from 2024-09-18 10-00-50

Analysis

The Analysis tab would not make much sense, as we do not have much to put in it; we could instead have a Performance tab, where we could display the memory/CPU utilisation.

Settings

We do not support changing the configuration after a container has started (sadly), but we could (maybe) have a read-only page.
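For the suggested Performance tab, the CPU figure would presumably come from two successive container stats samples. Below is the usual Docker/Podman-style calculation, sketched purely as an assumption about what such a tab could display; the function and field names are illustrative, not from AI Lab.

```python
def cpu_percent(prev_cpu_ns: int, cur_cpu_ns: int,
                prev_system_ns: int, cur_system_ns: int,
                online_cpus: int) -> float:
    """Container CPU delta over system CPU delta, scaled by the number
    of online CPUs - the standard stats formula."""
    cpu_delta = cur_cpu_ns - prev_cpu_ns
    system_delta = cur_system_ns - prev_system_ns
    if system_delta <= 0 or cpu_delta < 0:
        return 0.0
    return (cpu_delta / system_delta) * online_cpus * 100.0

def memory_percent(usage_bytes: int, limit_bytes: int) -> float:
    """Memory usage as a share of the container's memory limit."""
    if limit_bytes <= 0:
        return 0.0
    return (usage_bytes / limit_bytes) * 100.0
```

As noted above, no equivalent exists for GPU/VRAM, which is why a GPU section would have nothing comparable to show.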

MariaLeonova commented 1 month ago

> I don't think we want to link the openai documentation website
>
> Why not? What would be the advantages / disadvantages of one over the other? Or maybe let's pose the question this way: what is it that the users will gain from documentation on this page? I would suppose they would learn more about inference endpoints and how to use them in their apps. A good example here is what Hugging Face does on their server pages - out of all the model catalogs and registries that UX has evaluated, they provide the most structured, relevant and helpful information. image
>
> Because we don't want to advertise a closed product, even if it's close to a standard, and only a few developers would be interested in it.

Ok, so you agree with @axel7083 that it should be Swagger? Okay! :) I'll just label it better then - "Inference API documentation"

MariaLeonova commented 1 month ago

> image
>
> I really like how they display the information.
>
> Let's take a look at the tabs
>
> Logs
>
> I have some doubts about the logs tab. It would be feasible to have, but we already have the logs in the container details, which can be opened by clicking on Screenshot from 2024-09-18 10-00-50
>
> Analysis
>
> The Analysis tab would not make much sense, as we do not have much to put in it; we could instead have a Performance tab, where we could display the memory/CPU utilisation.
>
> Settings
>
> We do not support changing the configuration after a container has started (sadly), but we could (maybe) have a read-only page.

Yes, I am not saying we should copy them, tabs included; I was merely showing the way they treat the Endpoint URL there for some inspo ^^

axel7083 commented 1 month ago

I like the tabs, I would not mind having them

MariaLeonova commented 1 month ago

@ekidneyrh and I looked at the UI and we came up with another UI suggestion: Screenshot 2024-09-18 at 12 12 54

Here:

axel7083 commented 1 month ago

Why are we moving the Container section out of the header? I don't see the point.

image

Having it at the top makes it consistent with other parts of the application

image

image

image

MariaLeonova commented 1 month ago

> Why are we moving the Container section out of the header? I don't see the point.

It was not a decision I made; it was just an artefact of the penpot file I was reusing from previous designs. I'll update it to match the current state.

MariaLeonova commented 1 month ago

Screenshot 2024-09-18 at 15 36 08 @axel7083 ^^

axel7083 commented 1 month ago

> @axel7083 ^^

Great!

MariaLeonova commented 1 month ago

@jeffmaury @slemeur @axel7083 Are you happy with this design suggestion?

jeffmaury commented 1 month ago

LGTM, but I think the Inference Server block takes too much space vertically. Is there a way to shrink it a little bit? What is the More Info link?

slemeur commented 1 month ago

The most important information on this page is the endpoint information: that needs to be moved to the top. There is no reason a user would start an inference server otherwise.

In the UI, we have boxes inside boxes inside another set of boxes. Would it be possible to make things a bit simpler?

There are a couple of inconsistencies:

MariaLeonova commented 1 month ago

@slemeur I tried to address most of your concerns in this new mock:

Here's the penpot link for comments: penpot

I would also be happy to meet and talk through individual elements.

Copy - Server Details - 2 _ 10 (1)

MariaLeonova commented 1 month ago

A couple of comments from the UX sync call: the model name should be clickable; quantisation info is currently not available.

axel7083 commented 1 month ago

> Couple comments from UX sync call: model name should be clickable; quantisation info is currently not available.

Breadcrumbs are missing, but nice mock-ups, love it!

nichjones1 commented 1 month ago

Closed, implementation is being tracked by #1592