claytondukes opened 7 months ago
Awesome! I'll dig into this soon, how are emojis interpreted by DALLE?
Same way
I meant, more specifically: I'm curious whether using emoji degrades DALL-E's performance. Do you have any insight into this?
I’ve noticed nothing like that. If anything, I would bet it’s faster.
It’s worth noting that it doesn’t always produce emojis. It’s just some internal communication language that only the AI understands.
I noticed today that I pasted the wrong compression prompt; that one was from a GPT app I was messing with. For just the compression part, you need to remove the "you are GPTgz" portion; otherwise, the compressed prompt won't use the right context. Update (this should be all you need to do before sending to the API):
Compress the provided text into a single paragraph, ensuring that you (AI) can reconstruct it with 100% accuracy to the original intent. This compressed text is for internal use and does not need to be human-readable. Utilize any non-standard protocols, including language-mixing, abbreviations, slang, technical jargon, Unicode, Emoji, or any other unconventional methods, as long as the result is a lossless compression of the original text. The compressed text should maintain its integrity, particularly when pasted into a new or different context. To achieve this, include a contextual cue or introductory sentence that aids in accurate interpretation. Prioritize compression, losslessness, and utilization of non-standard protocols.
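As a sketch, here is one way that compression prompt could be wrapped around the text to compress before sending it to the chat API. The model name, the message layout, and the `build_compression_request` helper are assumptions for illustration, not something confirmed in this thread:

```python
# Sketch: build a chat-completion payload that puts the compression
# instructions in the system message and the text to compress in the
# user message. Pass the resulting dict to your API client of choice
# (e.g. the OpenAI Python SDK); the model name here is a placeholder.

COMPRESSION_PROMPT = (
    "Compress the provided text into a single paragraph, ensuring that "
    "you (AI) can reconstruct it with 100% accuracy to the original "
    "intent. Utilize any non-standard protocols, including "
    "language-mixing, abbreviations, Unicode, or Emoji, as long as the "
    "result is a lossless compression of the original text."
)

def build_compression_request(text: str, model: str = "gpt-4") -> dict:
    """Return a chat-completion request body (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": COMPRESSION_PROMPT},
            {"role": "user", "content": text},
        ],
    }

payload = build_compression_request("A long DALL-E prompt to squeeze down")
```

The system/user split keeps the instructions and the payload text from bleeding into each other, which matters here because the prompt explicitly warns about context contamination.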
Update to my update ;)
I was working on testing this more rigorously today and wrote a small script to feed the output into langchain so that I could verify that the token count was actually lower. I updated my prompt to get it down (about 50% fewer tokens in my test):
Condense the following text into a minimal, GPT-readable format, ensuring complete fidelity to the original meaning. Use any efficient encoding methods (e.g., shorthand, technical symbols, mixed language). The output should be robust against context shifts, ideally starting with a clarifying header. Aim for maximum compression with absolute losslessness.
What I didn't realize before was that the emoji use, while shorter to the human eye, was taking more tokens. So this prompt works a lot better.
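A rough way to see why: byte-level BPE tokenizers consume UTF-8 bytes, and a typical emoji is 4 bytes (often splitting into two or three tokens), while a common English word is frequently a single token. Exact counts would need a tokenizer library such as tiktoken; this stdlib-only sketch just shows the byte-length asymmetry behind the effect:

```python
# Emoji look short on screen but are heavy in UTF-8 bytes, which is
# what byte-level BPE tokenizers actually operate on. This does NOT
# compute real token counts (use a tokenizer like tiktoken for that);
# it only illustrates the bytes-per-character gap.

plain = "dark fantasy dungeon"  # 20 characters, 20 UTF-8 bytes
emoji = "\U0001F312\U0001F9EC\U0001F525"  # 3 characters, 12 UTF-8 bytes

def utf8_bytes(s: str) -> int:
    """Length of the string as the byte stream a BPE tokenizer sees."""
    return len(s.encode("utf-8"))

print(len(plain), utf8_bytes(plain))  # 20 20 -> 1 byte per character
print(len(emoji), utf8_bytes(emoji))  # 3 12  -> 4 bytes per character
```

So a prompt that is "shorter" in visible characters can still be longer in bytes, and therefore in tokens.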
@Kav-K, I was searching to see if anyone else had tried to do "compression" like mine and found a paper that Shapiro wrote. I ran a lot of tests and came to the conclusion that a mixed use might be beneficial.
Mine uses fewer tokens on shorter stuff (like prompts); SPR is better at decompressing longer text (the result is more similar to the original). I tried inputting a bedtime story I wrote for my daughter, about 36k characters. Mine compressed it better, but also lost more fidelity during decompression. So maybe you can use a mix? E.g.: for prompts or summarizing smaller inputs (up to 3 paragraphs or so), use mine; for longer inputs, I'd try that SPR method.
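The mixed strategy above could be sketched as a simple dispatcher. The three-paragraph threshold is the rough guess from the comment, not a measured cutoff, and both prompt strings here are abbreviated placeholders rather than the full tested prompts:

```python
# Sketch of the mixed approach: use the short emoji/shorthand
# compression prompt for small inputs and an SPR-style prompt for
# long ones. Threshold and prompt wording are placeholder guesses.

SHORT_PROMPT = (
    "Condense the following text into a minimal, GPT-readable format, "
    "ensuring complete fidelity to the original meaning."
)
SPR_PROMPT = (
    "Render the following text as a Sparse Priming Representation: "
    "a short list of distilled assertions and associations."
)  # placeholder wording for Shapiro's SPR technique

def pick_compression_prompt(text: str, max_short_paragraphs: int = 3) -> str:
    """Choose a compression prompt based on input length (heuristic)."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    if len(paragraphs) <= max_short_paragraphs:
        return SHORT_PROMPT
    return SPR_PROMPT
```

The chosen prompt would then go into the system message of the compression request, with the text to compress as the user message.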
I installed your bot today and noticed that when I tried to use
/draw
it said I can't have more than 100 characters. I've been fine-tuning my "compress this" prompt for a long time, and it works surprisingly well. When I have a long prompt, or am even sending long text to the GPT API, I will compress it using some form of the following:
(then, of course, include the text you want compressed)
For example:
Produces:
🗜️GPTgz🧬TxtCmprs📜: 🎨DiabloImmortal👹char, 🖼️sq. {USER}📛➕🎨, 🚫📝➡️. 🌌Gothic/fiery🔥dungeon, 📸CanonEOS5DMkIV💡. 🖌️dark🌒fantasy, 🎨scheme🎭. 🤖DALL-E👁️🗝️: 1⃣🚫"PhotoRealistic"🔍, 2⃣📷meta🎚️(ISO,🔍,📸speed), 3⃣👨🎨name🚫🔇, 4⃣🎨style🔎(tilt shift👁️🎢). 🎯Realistic🌌📸.
Typically, I will include further instructions or clarification as plain text. In the case of the above, I would paste the output, then add:
You will now ask the {USER} for their {USER} name, character class, and gender, and then proceed to generate an image based on their response.