ali-vilab / In-Context-LoRA

Official repository of In-Context LoRA for Diffusion Transformers

What is the content of your work? #5

Open s3219521aa opened 2 days ago

s3219521aa commented 2 days ago

Fine-tuning with open-source projects? I don't quite understand.

huanglianghua commented 1 day ago

> Fine-tuning with open-source projects? I don't quite understand.

In-Context LoRA can roughly be seen as a second phase of pretraining for text-to-image generation (although it is currently still task-specific tuning). Just as SFT for large language models doesn't require new code, only new data, In-Context LoRA also doesn't require new code, just new data.
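
To make the "new data, not new code" point concrete, here is a minimal sketch (my own illustration, not code from this repository) of what assembling one In-Context LoRA training sample might look like: several related images are stitched into a single composite, and one joint caption describes all panels. The file names, captions, and `[IMAGE-N]` delimiter style are assumptions, not the repository's actual data format.

```python
from PIL import Image

# Hypothetical inputs: three related views of the same product.
# File names and captions are illustrative assumptions.
panel_paths = ["frame_1.png", "frame_2.png", "frame_3.png"]
panel_captions = [
    "a studio shot of the bottle on a marble table",
    "the same bottle held in a hand outdoors",
    "the same bottle inside a gift box with a ribbon",
]

# Stitch the panels side by side into one composite training image.
panels = [Image.open(p).convert("RGB") for p in panel_paths]
height = min(img.height for img in panels)
panels = [img.resize((img.width * height // img.height, height)) for img in panels]
composite = Image.new("RGB", (sum(img.width for img in panels), height))
x = 0
for img in panels:
    composite.paste(img, (x, 0))
    x += img.width

# One joint caption describes all panels; this set-level structure is
# what the LoRA is tuned to reproduce. The [IMAGE-N] style is an assumption.
caption = "This set of three images shows the same product in different scenes. " + " ".join(
    f"[IMAGE-{i + 1}] {c}." for i, c in enumerate(panel_captions)
)

composite.save("sample_0001.png")
with open("sample_0001.txt", "w") as f:
    f.write(caption)
```

A standard text-to-image LoRA training script can then consume these composite image–caption pairs unchanged; that is the sense in which only the data is new.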

JoshonSmith commented 8 hours ago

> > Fine-tuning with open-source projects? I don't quite understand.
>
> In-Context LoRA can roughly be seen as a second phase of pretraining for text-to-image generation (although it is currently still task-specific tuning). Just as SFT for large language models doesn't require new code, only new data, In-Context LoRA also doesn't require new code, just new data.

Given that this is currently task-specific fine-tuning, we still don't see what the authors' contribution is.

JoshonSmith commented 8 hours ago

> > Fine-tuning with open-source projects? I don't quite understand.
>
> In-Context LoRA can roughly be seen as a second phase of pretraining for text-to-image generation (although it is currently still task-specific tuning). Just as SFT for large language models doesn't require new code, only new data, In-Context LoRA also doesn't require new code, just new data.

We can also train a LoRA to craft a custom story featuring a specific person.

huanglianghua commented 5 hours ago

> > > Fine-tuning with open-source projects? I don't quite understand.
> >
> > In-Context LoRA can roughly be seen as a second phase of pretraining for text-to-image generation (although it is currently still task-specific tuning). Just as SFT for large language models doesn't require new code, only new data, In-Context LoRA also doesn't require new code, just new data.
>
> We can also train a LoRA to craft a custom story featuring a specific person.

Thank you for your question. This work is an experimental attempt to validate whether a wide range of controllable generation tasks—such as photo retouching, visual effects, identity preservation, visual identity surroundings, font transfer, and more—can be unified under a simple paradigm. The core idea is to concatenate both the condition and target images into a single composite image and then use natural language to define the task.
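
To illustrate the composite-image idea at inference time, here is a minimal sketch of image-conditioned generation via inpainting over a concatenated canvas: the condition image fills the left panel, the right panel is masked, and a single prompt describes both panels. This uses diffusers' generic Stable Diffusion inpainting pipeline as a stand-in rather than the authors' actual FLUX-based setup; the model ID, canvas sizes, and prompt wording are assumptions.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline  # stand-in pipeline, not the authors' setup

# Hypothetical condition image (e.g., a product whose identity should be preserved).
condition = Image.open("condition.png").convert("RGB").resize((512, 512))

# Build a two-panel canvas: condition on the left, blank target panel on the right.
canvas = Image.new("RGB", (1024, 512))
canvas.paste(condition, (0, 0))

# Mask only the right panel, so the model inpaints the target while
# attending to the visible condition panel in context.
mask = Image.new("L", (1024, 512), 0)
mask.paste(255, (512, 0, 1024, 512))

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting", torch_dtype=torch.float16
).to("cuda")

# One prompt describes both panels; natural language defines the task.
prompt = (
    "A two-panel image of the same bottle: the left panel shows it on a plain "
    "background, the right panel shows it in a sunlit kitchen scene."
)
result = pipe(prompt=prompt, image=canvas, mask_image=mask,
              height=512, width=1024).images[0]

# Crop out the generated target panel.
result.crop((512, 0, 1024, 512)).save("target.png")
```

Training uses the same canvas-plus-joint-caption format, so moving to a new task means preparing new composites and prompts rather than writing new code.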

Our goal is to develop a task-agnostic framework that generalizes across various generation tasks. While we’ve made progress, this is still an ongoing effort, and we’re actively working on refining and extending the approach. Feedback and suggestions are always welcome!