Make image generation very nice and spicy

lalalune commented 6 days ago

@o-on-x has offered to work on this for their own agent

o-on-x commented 5 days ago

Character issues: we need to define how image generation is influenced by the character file and by user requests.

For now, the character file already influences image generation if the style section of the character file directly mentions what images the agent posts and how often.
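
As a rough illustration of the point above, here is what such style hints might look like, written as a TypeScript object. The `all` / `chat` / `post` layout and the wording of each entry are assumptions for the sake of the example, not taken from an actual character file.

```typescript
// Hypothetical sketch of a character file's style section.
// The field names (all, chat, post) and every entry below are assumptions.
interface CharacterStyle {
  all: string[];
  chat: string[];
  post: string[];
}

const style: CharacterStyle = {
  all: ["speaks in short, playful sentences"],
  chat: ["generates an image when a user explicitly asks for one"],
  post: [
    "posts surreal, neon-colored landscape images",
    "attaches an image to roughly one in every three posts",
  ],
};

// Collect the hints that mention images, i.e. the ones that would steer image generation
const imageHints = [...style.all, ...style.chat, ...style.post].filter((line) =>
  line.includes("image"),
);
console.log(imageHints.length);
```

The idea is only that plain-language instructions about image content and frequency live next to the other style rules, so the model sees them whenever it composes a post.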

Upload issues: the main functional issue is that imageGen returns a base64 data URL string, which needs to be loaded into a buffer or file before upload. The second issue is that the Twitter client needs an upload-message addition on its interface, which twitter-client-api might support.

Current working changes for image posts that I'm using for Discord (Telegram is similar; I still need to find where I put that code).
In src/clients/discord/messages.ts:

```typescript
// Excerpt from the message handler; message, userId, roomId, and messageId
// are assumed to come from the enclosing scope.
const callback: HandlerCallback = async (content: Content, files: any[]) => {
  // Process any data URL attachments
  const processedFiles = [...(files || [])];

  if (content.attachments?.length) {
    for (const attachment of content.attachments) {
      if (attachment.url?.startsWith('data:')) {
        try {
          // Decode the base64 payload so it can be uploaded as a file
          const { buffer, type } = await this.attachmentManager.processDataUrlToBuffer(attachment.url);
          const extension = type.split('/')[1] || 'png';
          const fileName = `${attachment.id || Date.now()}.${extension}`;

          processedFiles.push({
            attachment: buffer,
            name: fileName,
          });
          // Replace the message text with a placeholder
          content.text = "...";
          // Update the attachment URL to reference the filename
          attachment.url = `attachment://${fileName}`;
        } catch (error) {
          console.error('Error processing data URL:', error);
        }
      }
    }
  }

  if (message.id && !content.inReplyTo) {
    content.inReplyTo = stringToUuid(message.id);
  }

  if (message.channel.type === ChannelType.GuildVoice) {
    // For voice channels, speak the reply instead of sending text
    console.log("generating voice");
    const audioStream = await SpeechService.generate(
      this.runtime,
      content.text,
    );
    await this.voiceManager.playAudioStream(userId, audioStream);
    const memory: Memory = {
      id: stringToUuid(message.id),
      userId: this.runtime.agentId,
      content,
      roomId,
      embedding: embeddingZeroVector,
    };
    return [memory];
  } else {
    // For text channels, send the message with the processed files
    const messages = await sendMessageInChunks(
      message.channel as TextChannel,
      content.text,
      message.id,
      processedFiles,
    );
    const memories: Memory[] = [];
    for (const m of messages) {
      let action = content.action;
      // If there's only one message or it's the last message, keep the original action
      // For multiple messages, set all but the last to 'CONTINUE'
      if (messages.length > 1 && m !== messages[messages.length - 1]) {
        action = "CONTINUE";
      }

      const memory: Memory = {
        id: stringToUuid(m.id),
        userId: this.runtime.agentId,
        content: {
          ...content,
          action,
          inReplyTo: messageId,
          url: m.url,
        },
        roomId,
        embedding: embeddingZeroVector,
        createdAt: m.createdTimestamp,
      };
      memories.push(memory);
    }
    // Persist one memory per message that was sent
    for (const m of memories) {
      await this.runtime.messageManager.createMemory(m);
    }
    return memories;
  }
};
```

In src/clients/discord/attachments.ts, inside class AttachmentManager:

```typescript
async processDataUrlToBuffer(dataUrl: string): Promise<{ buffer: Buffer; type: string }> {
  // Capture the MIME type and the base64 payload.
  // Note: this pattern rejects MIME types that contain digits or dots.
  const matches = dataUrl.match(/^data:([A-Za-z-+\/]+);base64,(.+)$/);

  if (!matches || matches.length !== 3) {
    throw new Error('Invalid data URL');
  }

  const type = matches[1];
  const base64Data = matches[2];
  const buffer = Buffer.from(base64Data, 'base64');

  return { buffer, type };
}
```

o-on-x commented 5 days ago

The line for catching the image is actually `if (attachment.url?.startsWith('data:image')) {`
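
To show what the narrower prefix check changes, here is a small standalone sketch. The helper name `isImageDataUrl` and the sample attachments are made up for illustration; only the `startsWith('data:image')` test comes from the comment above.

```typescript
// Hypothetical helper: true only for data URLs whose MIME type starts with "image"
function isImageDataUrl(url: string | undefined): boolean {
  return url?.startsWith("data:image") ?? false;
}

// Sample attachments (made up): an image data URL, a text data URL, and a normal link
const attachments = [
  { id: "a", url: "data:image/png;base64,iVBORw0KGgo=" },
  { id: "b", url: "data:text/plain;base64,aGVsbG8=" },
  { id: "c", url: "https://example.com/cat.png" },
];

// Only attachment "a" passes; with a plain 'data:' check, "b" would pass as well
const imageIds = attachments.filter((a) => isImageDataUrl(a.url)).map((a) => a.id);
console.log(imageIds);
```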

o-on-x commented 4 days ago

Handling image gen this way affects the handling of text attachments. The solution will be to have image gen save the image and return a file path, not a base64 string, and then to have the action handled as ACTION = "IMAGE GEN".
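
A minimal sketch of that proposal, assuming Node.js and its built-in `fs`, `os`, `path`, and `crypto` modules. The function name `saveImageDataUrl`, the temp-directory location, and the shape of the `content` object are hypothetical; this is not the project's implementation.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { randomUUID } from "node:crypto";

// Hypothetical helper: decode a base64 image data URL, write it to disk,
// and return the file path instead of the base64 string.
function saveImageDataUrl(dataUrl: string, outDir: string): string {
  const match = /^data:(image\/[\w.+-]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Not a base64 image data URL");
  }
  const [, mimeType, base64Data] = match;
  // "image/png" -> "png", "image/svg+xml" -> "svg"
  const extension = mimeType.split("/")[1].split("+")[0];
  const filePath = join(outDir, `${randomUUID()}.${extension}`);
  writeFileSync(filePath, Buffer.from(base64Data, "base64"));
  return filePath;
}

// Usage: the payload here is placeholder bytes, not a real image
const outDir = mkdtempSync(join(tmpdir(), "imagegen-"));
const payload = Buffer.from("not a real image").toString("base64");
const filePath = saveImageDataUrl(`data:image/png;base64,${payload}`, outDir);

// Assumed content shape: the attachment carries a path and the action marks it as image gen
const content = { text: "", action: "IMAGE GEN", attachments: [{ url: filePath }] };
console.log(content.action, readFileSync(filePath).length);
```

With a file path in the attachment, clients would no longer need to sniff `data:` URLs, so text attachments could pass through untouched and each client could branch on the action instead.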

ai16z / eliza

Make image generation very nice and spicy #158