Open sayakpaul opened 10 months ago
Yes, sure! Feel free to let us know if you need any help.
For starters, I think it might be better to add this to research_projects, similar to ControlNetXS. We might not be able to add it to community because AnyText has modelling components.
Does this make sense? If we see enough usage, we can include it in the core.
Hi @coding-famer. Have you been able to make progress on this? I'd very much like to be able to use this with diffusers, and would like to help where I can. From the pipeline perspective, I understand most of the code and have made some significant progress. From the modelling perspective, I'm not too sure about what new additions need to be made as I'm still navigating the codebase.
This is a link to the converted AnyText model on Hugging Face, which might be of help. It took me a very long time (~18 hours) to download from the ModelScope hub servers, which I assume are located in China. I'm hoping the conversion to diffusers format was correct. I'm still looking into it and do not have the full picture, but it seems that different weights are used in the CLIP encoder depending on the embedding type here: (but the OCR and ViT components only seem to be useful for text editing, which could probably be done sometime in the future; for now, replicating the text-generation part would be great)
Hi, I'm still working on this. Happy to do it together.
Hey, sorry for the late response. I got caught up with other PRs and looking into other interesting work. Would Discord be okay for communication if you're still progressing on this?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Contributions are still welcome.
@sayakpaul, can I work on this?
Sure, we can start with a community pipeline :)
not stale
not stale
Can I work on this community pipeline?
Edit 1: I have been busy for the past several weeks because of personal issues. From now on, I am fully committed to this. Sorry for holding up this pipeline so far.
Edit 2: I largely understand the pipeline. Now, I am trying to convert the checkpoint into diffusers' format. It has a ControlNet model and several other special components.
Yes, you can. Thank you :)
Model/Pipeline/Scheduler description
From the repository:
Open source status
Provide useful links for the implementation
Repository: https://github.com/tyxsspa/AnyText
Paper: https://arxiv.org/abs/2311.03054
Weights and inference code: https://modelscope.cn/models/damo/cv_anytext_text_generation_editing/summary