ali-vilab / In-Context-LoRA

Official repository of In-Context LoRA for Diffusion Transformers

What is your job content? #5

Closed · s3219521aa closed this issue 2 weeks ago

s3219521aa commented 3 weeks ago

Fine-tuning with open-source projects? I don't quite understand.

huanglianghua commented 3 weeks ago

> Fine-tuning with open-source projects? I don't quite understand.

The In-Context LoRA can roughly be seen as a second phase of pretraining for text-to-image generation (although it’s currently still task-specific tuning). Just as SFT for large language models doesn’t require new code, only new data, in-context LoRA also doesn’t require new code, just new data.
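
To make the "no new code, just new data" point concrete, here is a minimal sketch of what a single training record could look like under this paradigm. The file path, field names, panel markers, and caption wording are hypothetical illustrations, not the repository's actual data format:

```python
# Hypothetical training record for In-Context LoRA-style fine-tuning.
# The image is one composite containing several related panels, and the
# caption describes all panels together in a single piece of text.
record = {
    # Composite of three storyboard panels stitched into one image (hypothetical path).
    "image": "data/film_storyboard_0001.jpg",
    "caption": (
        "This three-panel storyboard follows a detective through a rainy city; "
        "[IMAGE1] he studies a map under a streetlight, "
        "[IMAGE2] he enters a dimly lit bar, "
        "[IMAGE3] he confronts a suspect on a rooftop."
    ),
}

# An off-the-shelf text-to-image LoRA trainer can consume such records unchanged:
# the "new data" is the set of composite images plus merged captions, not new code.
```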

JoshonSmith commented 3 weeks ago

> Fine-tuning with open-source projects? I don't quite understand.
>
> The In-Context LoRA can roughly be seen as a second phase of pretraining for text-to-image generation (although it’s currently still task-specific tuning). Just as SFT for large language models doesn’t require new code, only new data, in-context LoRA also doesn’t require new code, just new data.

Given that this is currently still task-specific fine-tuning, we still don't see what the authors' contribution is.

JoshonSmith commented 3 weeks ago

> Fine-tuning with open-source projects? I don't quite understand.
>
> The In-Context LoRA can roughly be seen as a second phase of pretraining for text-to-image generation (although it’s currently still task-specific tuning). Just as SFT for large language models doesn’t require new code, only new data, in-context LoRA also doesn’t require new code, just new data.

We can also already train a LoRA to craft a custom story featuring a specific person.

huanglianghua commented 3 weeks ago

> Fine-tuning with open-source projects? I don't quite understand.
>
> The In-Context LoRA can roughly be seen as a second phase of pretraining for text-to-image generation (although it’s currently still task-specific tuning). Just as SFT for large language models doesn’t require new code, only new data, in-context LoRA also doesn’t require new code, just new data.
>
> We can also already train a LoRA to craft a custom story featuring a specific person.

Thank you for your question. This work is an experimental attempt to validate whether a wide range of controllable generation tasks—such as photo retouching, visual effects, identity preservation, visual identity surroundings, font transfer, and more—can be unified under a simple paradigm. The core idea is to concatenate both the condition and target images into a single composite image and then use natural language to define the task.
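
As a rough illustration of this composite-image idea (not the authors' actual data pipeline), the sketch below stitches a condition image and a target image into a single two-panel canvas with PIL and pairs it with one natural-language prompt. The file names and prompt wording are made up for the example:

```python
from PIL import Image

def make_composite(condition_path: str, target_path: str, height: int = 1024) -> Image.Image:
    """Concatenate a condition image and a target image side by side.

    Both panels are resized to a common height so they form one
    two-panel composite that a text-to-image model can be tuned on.
    """
    cond = Image.open(condition_path).convert("RGB")
    tgt = Image.open(target_path).convert("RGB")

    # Resize both panels to the same height, preserving aspect ratio.
    cond = cond.resize((int(cond.width * height / cond.height), height))
    tgt = tgt.resize((int(tgt.width * height / tgt.height), height))

    composite = Image.new("RGB", (cond.width + tgt.width, height))
    composite.paste(cond, (0, 0))
    composite.paste(tgt, (cond.width, 0))
    return composite

# Hypothetical usage: the left panel is the condition (e.g. a raw photo),
# the right panel is the target (e.g. the retouched result), and the task
# is defined purely by the paired prompt, not by any new model code.
composite = make_composite("raw_photo.jpg", "retouched_photo.jpg")
composite.save("retouching_pair.jpg")
prompt = (
    "A two-panel image of the same scene; the left panel shows the original "
    "photo and the right panel shows it after professional retouching."
)
```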

Our goal is to develop a task-agnostic framework that generalizes across various generation tasks. While we’ve made progress, this is still an ongoing effort, and we’re actively working on refining and extending the approach. Feedback and suggestions are always welcome!

akk-123 commented 3 weeks ago

@huanglianghua You released the weights for 10 models. Could you also release the training data for these models?

huanglianghua commented 3 weeks ago

> @huanglianghua You released the weights for 10 models. Could you also release the training data for these models?

Thank you for your interest. We currently have no plans to open-source the training data for these models.