Open lalalune opened 6 days ago
character issues: need to define how image gen is influenced by the character file & user requests
for now the character file already influences image gen if you directly mention what images it posts & how often in the style part of the character file
upload issues: the main functionality issue is that imageGen returns a base64 data URL string, which needs to be loaded into a buffer/file before upload. the second issue is that the twitter client needs an upload-message addition on the interface, which twitter-client-api might support
current working changes for image posting that I'm using for discord (telegram is similar, need to find where i put that code):
in src/clients/discord/messages.ts:
```typescript
const callback: HandlerCallback = async (content: Content, files: any[]) => {
    // Process any data URL attachments
    const processedFiles = [...(files || [])];
    if (content.attachments?.length) {
        for (const attachment of content.attachments) {
            if (attachment.url?.startsWith('data:')) {
                try {
                    const {buffer, type} = await this.attachmentManager.processDataUrlToBuffer(attachment.url);
                    const extension = type.split('/')[1] || 'png';
                    const fileName = `${attachment.id || Date.now()}.${extension}`;
                    processedFiles.push({
                        attachment: buffer,
                        name: fileName
                    });
                    content.text = "...";
                    // Update the attachment URL to reference the filename
                    attachment.url = `attachment://${fileName}`;
                } catch (error) {
                    console.error('Error processing data URL:', error);
                }
            }
        }
    }
    if (message.id && !content.inReplyTo) {
        content.inReplyTo = stringToUuid(message.id);
    }
    if (message.channel.type === ChannelType.GuildVoice) {
        console.log("generating voice");
        const audioStream = await SpeechService.generate(
            this.runtime,
            content.text,
        );
        await this.voiceManager.playAudioStream(userId, audioStream);
        const memory: Memory = {
            id: stringToUuid(message.id),
            userId: this.runtime.agentId,
            content,
            roomId,
            embedding: embeddingZeroVector,
        };
        return [memory];
    } else {
        // For text channels, send the message with the processed files
        const messages = await sendMessageInChunks(
            message.channel as TextChannel,
            content.text,
            message.id,
            processedFiles,
        );
        let notFirstMessage = false;
        const memories: Memory[] = [];
        for (const m of messages) {
            let action = content.action;
            // If there's only one message or it's the last message, keep the original action
            // For multiple messages, set all but the last to 'CONTINUE'
            if (messages.length > 1 && m !== messages[messages.length - 1]) {
                action = "CONTINUE";
            }
            notFirstMessage = true;
            const memory: Memory = {
                id: stringToUuid(m.id),
                userId: this.runtime.agentId,
                content: {
                    ...content,
                    action,
                    inReplyTo: messageId,
                    url: m.url,
                },
                roomId,
                embedding: embeddingZeroVector,
                createdAt: m.createdTimestamp,
            };
            memories.push(memory);
        }
        for (const m of memories) {
            await this.runtime.messageManager.createMemory(m);
        }
        return memories;
    }
};
```
in src/clients/discord/attachments.ts/class AttachmentManager:
```typescript
// Decode a base64 data URL into a Buffer plus its MIME type.
async processDataUrlToBuffer(dataUrl: string): Promise<{buffer: Buffer, type: string}> {
    // Digits and dots are allowed in the MIME type so that e.g. "video/mp4" also matches
    const matches = dataUrl.match(/^data:([A-Za-z0-9.+\/-]+);base64,(.+)$/);
    if (!matches || matches.length !== 3) {
        throw new Error('Invalid data URL');
    }
    const type = matches[1];
    const base64Data = matches[2];
    const buffer = Buffer.from(base64Data, 'base64');
    return {buffer, type};
}
```
the line for catching the image is actually `if (attachment.url?.startsWith('data:image')) {`
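to illustrate why the narrower `data:image` prefix matters, here is a minimal standalone sketch that separates image data URLs from everything else. `AttachmentLike`, `isImageDataUrl` and `splitAttachments` are hypothetical names used only for this example, not part of the codebase:

```typescript
// Hypothetical minimal shape; the real attachment type has more fields.
interface AttachmentLike {
    id?: string;
    url?: string;
}

// Same prefix check as the line above: only image data URLs qualify.
function isImageDataUrl(url: string | undefined): boolean {
    return typeof url === "string" && url.startsWith("data:image");
}

// Split attachments so that only image data URLs go down the buffer/upload path,
// while text data URLs and ordinary links are left untouched.
function splitAttachments(attachments: AttachmentLike[]): {
    images: AttachmentLike[];
    others: AttachmentLike[];
} {
    const images: AttachmentLike[] = [];
    const others: AttachmentLike[] = [];
    for (const attachment of attachments) {
        (isImageDataUrl(attachment.url) ? images : others).push(attachment);
    }
    return { images, others };
}
```

with a check like this, a `data:text/plain;base64,...` attachment would land in `others` and skip the image handling.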
handling image gen this way affects the handling of text attachments. the solution will be to have image gen save the image and return a file path, not a base64 string, and then to have the action handled with ACTION = "IMAGE GEN"
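a rough sketch of what "save and return file path" could look like, assuming Node's built-in `fs`, `os`, `path` and `crypto` modules. `saveImageDataUrl` is a hypothetical helper name, and the temp-directory default and file naming are assumptions rather than anything decided in this issue:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
import { randomUUID } from "node:crypto";

// Hypothetical helper: decode a base64 image data URL, write it to disk,
// and return the file path instead of the base64 string.
function saveImageDataUrl(dataUrl: string, dir: string = os.tmpdir()): string {
    const match = dataUrl.match(/^data:(image\/[A-Za-z0-9.+-]+);base64,(.+)$/);
    if (!match) {
        throw new Error("Not a base64 image data URL");
    }
    // "image/png" -> "png"; "image/svg+xml" -> "svg"
    const extension = match[1].split("/")[1].split("+")[0];
    // Random name (an assumption) so concurrent generations do not collide
    const filePath = path.join(dir, `generated-${randomUUID()}.${extension}`);
    fs.writeFileSync(filePath, Buffer.from(match[2], "base64"));
    return filePath;
}
```

the clients could then upload from the returned path, and text attachments would never pass through the data URL branch.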
@o-on-x has offered to work on this for their own agent