hoarder-app / hoarder

A self-hostable bookmark-everything app (links, notes and images) with AI-based automatic tagging and full text search
https://hoarder.app
GNU Affero General Public License v3.0

Enhance flexibility for accepted tag format #318

Closed — Jason-233 closed this issue 4 months ago

Jason-233 commented 4 months ago

I'm currently using qwen1.5-14b-chat-awq and glm-4v for inference. I got these three responses in a row for a picture inference with glm-4v, and I wondered whether it would be possible for Hoarder to accept these responses too. Sometimes the same kind of response happens with qwen1.5-14b-chat-awq as well. I really appreciate your work. Below are the logs of the responses.

hoarder-workers-1      | 2024-07-21T15:12:39.893Z error: [inference][1612] inference job failed: Error: [inference][1612] The model ignored our prompt and didn't respond with the expected JSON: {}. Here's a sneak peak from the response: ```json
hoarder-workers-1      | {
hoarder-workers-1      |   "tags": 
hoarder-workers-1      | 2024-07-21T15:12:40.435Z info: [inference][1612] Starting an inference job for bookmark with id "iwrg9lg8vf7te4sf8cvf4z28"
hoarder-workers-1      | 2024-07-21T15:12:47.826Z error: [inference][1612] inference job failed: Error: [inference][1612] The model ignored our prompt and didn't respond with the expected JSON: {}. Here's a sneak peak from the response: ```json
hoarder-workers-1      | {
hoarder-workers-1      |   "tags": 
hoarder-workers-1      | 2024-07-21T15:12:48.870Z info: [inference][1612] Starting an inference job for bookmark with id "iwrg9lg8vf7te4sf8cvf4z28"
hoarder-workers-1      | 2024-07-21T15:12:53.630Z error: [inference][1612] inference job failed: Error: [inference][1612] The model ignored our prompt and didn't respond with the expected JSON: {}. Here's a sneak peak from the response: ```json
hoarder-workers-1      | {
hoarder-workers-1      |   "tags": 


kamtschatka commented 4 months ago

I don't understand what you mean? There is no data in this response, what would you expect to use here?

Jason-233 commented 4 months ago

Sorry, I thought that because the response started with ```json it was rejected as not being in the expected format. Then I took a further look at the response. That cannot be resolved, right?

```json
{
  "tags": [
    "天际未来科技有限公司",
    "Top 10客户名单",
    "排名",
    "排名变化",
    "企业名称",
    "市值(亿人民币)",
    "价值变化",
    "城市",
    "行业",
    "Infinity Innovations Inc.",
    "13800",
    "-3000",
    "洛杉矶",
    "金融科技"
  ]
  ... // Additional tags based on the content of the image
}
```

MohamedBassem commented 4 months ago

@Jason-233 In the prompt for images, we have "Don't wrap the response in a markdown code.", which I added explicitly to ask the model not to add this wrapping. I think we can probably add it in the prompt for text as well.

Jason-233 commented 4 months ago

> @Jason-233 In the prompt for images, we have "Don't wrap the response in a markdown code.", which I added explicitly to ask the model not to add this wrapping. I think we can probably add it in the prompt for text as well.

@MohamedBassem thanks for the explanation. glm-4v, the vision model I use for picture inference, just keeps adding this wrapping. So that means I need a new vision model, right? Or could the JSON be extracted from the markdown code block?
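For what it's worth, the fallback mentioned above (extracting the JSON from a markdown code block) can be sketched in a few lines. This is a hypothetical helper, not part of Hoarder's codebase; `extractJson` and its exact regex are assumptions for illustration only:

```typescript
// Hypothetical helper: strip an optional ```json ... ``` markdown fence
// from a model response before attempting JSON.parse. If no fence is
// present, the response is parsed as-is.
function extractJson(response: string): unknown {
  // Lazily capture everything between an opening fence (with an optional
  // "json" language tag) and a closing fence.
  const fenced = response.match(/^\s*```(?:json)?\s*([\s\S]*?)\s*```\s*$/);
  const body = fenced ? fenced[1] : response;
  return JSON.parse(body); // still throws on genuinely malformed JSON
}
```

The trade-off, as noted below, is that this silently tolerates models that ignore the prompt instead of surfacing the problem.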

MohamedBassem commented 4 months ago

@Jason-233 Yeah, if the model is ignoring the prompt, I'm not a big fan of explicitly handling the markdown, to be honest. Can you try llava, for example, for the picture inference?