Open s3219521aa opened 2 days ago
Fine-tuning with open-source projects? I don't quite understand.
In-Context LoRA can roughly be seen as a second phase of pretraining for text-to-image generation (although it's currently still task-specific tuning). Just as SFT for large language models doesn't require new code, only new data, In-Context LoRA also doesn't require new code, just new data.
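To make the "only new data" point concrete, here is a minimal sketch of how a training sample might be assembled: several same-height panels are stitched into one composite image, and the per-panel captions are joined into a single prompt. The separator token and layout here are assumptions for illustration, not this repo's actual data format.

```python
import numpy as np

def make_composite(images, panel_prompts, sep=" [IMAGE-SEP] "):
    """Concatenate same-height panels side by side into one composite
    image and join the per-panel captions into a single prompt.
    (Illustrative only; the separator token is a made-up placeholder.)"""
    composite = np.concatenate(images, axis=1)  # stack along the width axis
    prompt = sep.join(panel_prompts)
    return composite, prompt

# Two dummy 64x64 RGB "panels" standing in for real images.
imgs = [np.zeros((64, 64, 3), dtype=np.uint8),
        np.full((64, 64, 3), 255, dtype=np.uint8)]
composite, prompt = make_composite(
    imgs,
    ["a man walking in the rain", "the same man opening an umbrella"])
print(composite.shape)  # (64, 128, 3)
```

The resulting (composite image, joined prompt) pair can then be fed to an unmodified LoRA fine-tuning script, which is exactly the "new data, no new code" claim above.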
Given that the current release is still task-specific fine-tuning, I still don't understand the contribution of the authors' work.
We can also train a LoRA to craft a custom story featuring a specific character.
Thank you for your question. This work is an experimental attempt to validate whether a wide range of controllable generation tasks—such as photo retouching, visual effects, identity preservation, visual identity surroundings, font transfer, and more—can be unified under a simple paradigm. The core idea is to concatenate both the condition and target images into a single composite image and then use natural language to define the task.
Our goal is to develop a task-agnostic framework that generalizes across various generation tasks. While we’ve made progress, this is still an ongoing effort, and we’re actively working on refining and extending the approach. Feedback and suggestions are always welcome!
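One way to picture the condition-and-target concatenation described above, at inference time, is an inpainting-style canvas: the condition image occupies one panel and a blank region marks where the target should be generated. The left/right layout and the mask convention below are assumptions for illustration, not the project's exact interface.

```python
import numpy as np

def build_inference_canvas(condition, target_hw):
    """Place the condition image on the left and a blank target region on
    the right; return the canvas plus a mask marking where the model
    should generate (1 = fill in). Layout and mask convention are
    illustrative assumptions."""
    h, w, c = condition.shape
    th, tw = target_hw
    assert th == h, "panels are assumed to share the same height"
    canvas = np.zeros((h, w + tw, c), dtype=condition.dtype)
    canvas[:, :w] = condition          # condition panel, kept fixed
    mask = np.zeros((h, w + tw), dtype=np.uint8)
    mask[:, w:] = 1                    # generate only the target panel
    return canvas, mask

cond = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
canvas, mask = build_inference_canvas(cond, (64, 64))
print(canvas.shape, int(mask.sum()))  # (64, 128, 3) 4096
```

The natural-language task description then applies to the whole canvas, so a single model can switch between tasks such as retouching or identity preservation simply by changing the prompt.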