Vali-98 / ChatterUI

Simple frontend for LLMs built in react-native.
GNU Affero General Public License v3.0

Road to 0.8.0 #89

Open Vali-98 opened 1 week ago

Vali-98 commented 1 week ago

NOTE: THIS IS NOT A FEATURE REQUEST POST, THERE'S MORE THAN ENOUGH WORK HERE TO BE DONE

Hey all!

I'm planning a big 0.8.0 update for the app to commemorate a year of development. The goal is to start yanking out a lot of temporary, undercooked, or poorly maintained features (just look at startupApp, yikes) and replacing them with newer solutions, as I've learned and evolved as a JavaScript/React Native developer over the last year. Here is the gist of the update:

Below is a rough idea of planned changes and additions for 0.8.0.

Feel free to reply to ask questions or raise concerns!

Road to 0.8

• CHAT
• CHARACTERS
• USER
• LOCAL MODE
• REMOTE MODE
• APP MODES
• SAMPLERS
• MINOR UI TWEAKS
• MISC
• STYLE UNIFICATION

REJECTED PILE

Here are changes which were tested but were either too difficult or non-viable.

Chat

Local Mode:

In-app downloading is removed due to download handling being too complex. Could be revisited in the future.

inspir3dArt commented 1 week ago

Hi,

I highly appreciate your hard work on this app, it's a really promising project.

From a user point of view there are three things that prevent me currently from using it frequently.

  1. The character card limitation. Most character cards use multiple fields like character, scenario, system prompt, jailbreak, example dialogues, and first message. I made modified versions of some of my favorites that put everything into the character and first message fields, but most cards you download don't work correctly out of the box without modifying them. KoboldCpp implements this quite simply: it just puts the text from all the fields, except the first message, together in a certain order and keeps it in context, which works quite well.

  2. The rerolls of the last message don't work. It's unfortunately a real deal breaker for roleplay; especially in the first replies, I usually have to reroll a few times to get the roleplay going in the direction I want.

  3. It would be nice if it were possible to link to a model you have on your device's storage, instead of copying it into the internal app storage. I have quite a few models on my device and storage is limited. I know it wouldn't be allowed in a version released on the Play Store, but otherwise it's technically possible.

Please don't see these as some sort of demands; I know it is a hobby project and I highly appreciate that you share your work. I'm quite impressed by how it performs, it's faster than running KoboldCpp locally in Termux, and it will be much more convenient to use. And I fully understand that it can be more interesting to rewrite and improve the more interesting parts with what you are learning along the way, and I hope you have a lot of fun doing it.

I guess from a user's point of view it would be great to have these things fixed before waiting for a big reinvention of the wheel, especially since it seems so close to being a really useful app. But I understand it's your project, and I hope you're enjoying your journey of learning by doing, without letting people's opinions or requests get under your skin or in your way. I know it can sometimes feel like getting pressured for sharing your hard work for free. I hope you manage to maintain your passion and motivation, and I look forward to seeing what you will make out of this interesting project.

Have a nice day.

Vali-98 commented 1 week ago

Hey there @inspir3dArt, thanks for the general feedback, I do have some things to share about your thoughts:

The character card limitation

This is one thing I'm aiming to fix with the Character Card Editor refresh. Currently, only description and examples are considered, but some websites use fields such as personality and scenario in context. These fields actually already exist and are stored in the app. It's really just a matter of implementing them so they respect the tokenizer system and don't break the prompt builder.
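For illustration only, the assembly could end up looking something like this, similar to what KoboldCpp does (the field names follow the usual character card format, and the helper below is just a hypothetical sketch, not the app's actual prompt builder):

```typescript
// Hypothetical sketch: fold the optional card fields into the context
// in a fixed order, skipping anything empty.
interface CharacterCard {
    description: string
    personality?: string
    scenario?: string
    mes_example?: string
    first_mes: string
}

const buildCardContext = (card: CharacterCard): string => {
    const parts = [
        card.description,
        card.personality && `Personality: ${card.personality}`,
        card.scenario && `Scenario: ${card.scenario}`,
        card.mes_example,
    ]
    // first_mes is the greeting shown in chat, not part of the persistent context
    return parts.filter(Boolean).join('\n')
}
```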

For the UI side of things, the current card editor hasn't been changed design-wise in a very long time. It's about time it gets a facelift.

The rerolls of the last message don't work.

This seems like a model issue. What a reroll does is essentially just make the same request but with a different seed value. I'm fairly sure all UIs do this. If possible, perhaps increasing Temperature or using samplers like XTC would improve variability.
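To illustrate, a reroll amounts to something like this (the request shape and helper here are hypothetical, not the app's actual code):

```typescript
// Hypothetical request shape: a reroll re-sends the same request with only the seed changed.
interface GenerationRequest {
    prompt: string
    temperature: number
    seed: number
}

const rerollRequest = (previous: GenerationRequest): GenerationRequest => ({
    ...previous,
    seed: Math.floor(Math.random() * 2 ** 32), // fresh random seed, everything else identical
})
```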

It would be nice if it were possible to link to a model you have on your device's storage

This is something I have actually tried before, but it always ended up crashing. I suppose I could revisit this to find a solution. Personally speaking, it'd also be nice not to copy model files over and over when I'm testing fresh installs.

EDIT: Update, the primary issue with loading models from storage is that Android sandboxes application files for security reasons and disallows reading file paths directly, so this sadly just isn't possible on the JS side, but I'll see if I can squeeze it into the native side.

GeminiX369 commented 5 days ago

I have a small suggestion. Could the app send parts of the text to TTS for synthesis as it generates them? That way, there is no need to wait for a large section of text to be fully generated before performing TTS speech synthesis all at once. This can significantly improve the user experience when the model's output is long. Punctuation such as commas and periods could be used as delimiters between text fragments.

Vali-98 commented 5 days ago

@GeminiX369 I have had this feature request before; however, at the moment it will require a somewhat big rework of the TTS system to actually execute, so no promises there.

@inspir3dArt I have tested a possible solution for external model loading; I think this may be possible to ship.

Issues-maker commented 4 days ago

Would you add a new feature to your roadmap: 'set a specific time to keep the model loaded'? It is already implemented in the competitor project Ollama App (https://github.com/JHubi1/ollama-app) if you need a code example for it.

This feature might be very useful for most users of your app; as for me, I prefer keeping a model in memory constantly so I can communicate with the LLM from my phone within a second.

There is also a big community of fans of your app on Telegram, feel free to join, there are some questions there too: https://t.me/chatterui

Vali-98 commented 4 days ago

Would you add a new feature to your roadmap: 'set a specific time to keep the model loaded'? It is already implemented in the competitor project Ollama App (https://github.com/JHubi1/ollama-app) if you need a code example for it.

The link given is to a UI that connects to Ollama. Ollama itself is the model manager here. Though it isn't impossible to create some kind of LLM background service for Android which loads models as needed, it is out of the scope of this project.

Issues-maker commented 4 days ago

Would you add a new feature to your roadmap: 'set a specific time to keep the model loaded'? It is already implemented in the competitor project Ollama App (https://github.com/JHubi1/ollama-app) if you need a code example for it.

The link given is to a UI that connects to Ollama. Ollama itself is the model manager here. Though it isn't impossible to create some kind of LLM background service for Android which loads models as needed, it is out of the scope of this project.

Android? It changes the time to keep the model loaded on a GPU server, at least that is what I use - a 70B model loaded on my GPU server. Personally, all these attempts to run LLMs inside phone memory are a joke to me. Anyway, for Ollama it needs the parameter "keep_alive = 0" (or a negative value to keep a model in memory permanently); I don't know mobile app programming, which is why I gave you an example from another app.

GeminiX369 commented 4 days ago

@GeminiX369 I have had this feature request before; however, at the moment it will require a somewhat big rework of the TTS system to actually execute, so no promises there.

@inspir3dArt I have tested a possible solution for external model loading; I think this may be possible to ship.

It's a pity that I don't know React Native, otherwise I would be happy to help. This is not difficult to achieve: you only need to remember the end of the current sentence being read aloud and the end of the next sentence to be read (that is, the position of the latest punctuation mark). After the current sentence finishes, the next sentence is sent. At the same time, while content is being received, the end of the next sentence to be read (the position of the latest punctuation mark) is updated in real time.
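Roughly, the idea could look something like this in TypeScript (the speak callback is just a placeholder for whatever TTS call ends up being used, not ChatterUI's actual API):

```typescript
// Rough sketch: buffer streamed text and hand complete fragments to TTS
// as soon as a punctuation boundary closes them.
const makeSentenceStreamer = (speak: (fragment: string) => void) => {
    let buffer = ''
    const boundary = /[.!?,;。！？，；]/

    return {
        // called for every streamed chunk from the model
        push(chunk: string) {
            buffer += chunk
            let index = buffer.search(boundary)
            while (index !== -1) {
                speak(buffer.slice(0, index + 1).trim())
                buffer = buffer.slice(index + 1)
                index = buffer.search(boundary)
            }
        },
        // called once generation finishes, to speak any trailing text
        flush() {
            if (buffer.trim()) speak(buffer.trim())
            buffer = ''
        },
    }
}
```

Each streamed chunk would go through push(), and flush() would speak whatever text remains once generation finishes.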

GeminiX369 commented 4 days ago

Would you add a new feature to your roadmap: 'set a specific time to keep the model loaded'? It is already implemented in the competitor project Ollama App (https://github.com/JHubi1/ollama-app) if you need a code example for it.

The link given is to a UI that connects to Ollama. Ollama itself is the model manager here. Though it isn't impossible to create some kind of LLM background service for Android which loads models as needed, it is out of the scope of this project.

Android? It changes the time to keep the model loaded on a GPU server, at least that is what I use - a 70B model loaded on my GPU server. Personally, all these attempts to run LLMs inside phone memory are a joke to me. Anyway, for Ollama it needs the parameter "keep_alive = 0" (or a negative value to keep a model in memory permanently); I don't know mobile app programming, which is why I gave you an example from another app.

I think it's meaningful. Although the performance of an LLM on a mobile phone is not as good as on a server, it is still capable of performing some simple text processing tasks. For example, you can use it to summarize text, remove some nonsense, or extract some information you want from the text, etc. It would be better if an API could be provided for other applications to use. I know it can be run directly in Termux, but that is not very convenient.

Vali-98 commented 4 days ago

Anyway, for Ollama it needs the parameter "keep_alive = 0" (or a negative value to keep a model in memory permanently)

Ah, I see, I thought this was a request for on-device model management. As for Ollama, it's just a generation parameter field; it can be done pretty easily based on the docs here: https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion

That said, this will likely be a part of the API-generalization rework.
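Based on those docs, forwarding it would roughly look like this (the endpoint URL, helper name, and model name below are placeholders for illustration; keep_alive itself is the documented field):

```typescript
// Sketch of forwarding keep_alive to an Ollama backend as one more request field.
// keep_alive controls how long the model stays loaded after the request;
// a negative value keeps it loaded indefinitely, 0 unloads it right away.
const generateWithKeepAlive = async (prompt: string, keepAlive: number | string = '5m') => {
    const res = await fetch('http://localhost:11434/api/generate', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            model: 'llama3', // placeholder model name
            prompt,
            stream: false,
            keep_alive: keepAlive,
        }),
    })
    const data = await res.json()
    return data.response as string
}
```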

GameOverFlowChart commented 3 days ago

  • A redoing of the app's color system, which IMO is too monochrome, trying to utilize proper neutral colors for light/dark modes.

Just make sure to support true dark mode for AMOLED screens; it saves battery, and fewer active pixels would hopefully help keep screen burn-in issues from happening.
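For what it's worth, that could be as simple as keeping a true-black variant next to the regular dark palette (the color values below are arbitrary placeholders, not the app's actual theme):

```typescript
// Illustrative palettes only: a regular dark theme plus an AMOLED variant
// that swaps in pure black surfaces so the pixels switch off entirely.
const darkTheme = {
    background: '#121212',
    surface: '#1e1e1e',
    text: '#e6e6e6',
}

const amoledTheme = {
    ...darkTheme,
    background: '#000000', // true black
    surface: '#000000',
}
```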