posterllava / PosterLLaVA


When do you plan to release the automatic text-to-poster pipeline PosterGen? #7

Open hitlxm opened 1 month ago

hitlxm commented 1 month ago

See the title.

posterllava commented 1 month ago

Thanks for your attention! We will update the paper on arXiv before Nov. 2024, and possibly release the tutorial code by then as well.

hitlxm commented 1 month ago

Also, c+s -> p is not supported at the moment. Do you have any plans for it?

posterllava commented 1 month ago

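Our current settings are aligned with CGL and PosterLayout, but the method is inherently compatible with c->s+p, c+s->p, and even none->c+s+p. Supporting these only requires masking the input data differently to suit your needs. By default we mask out the size and position in the input (they become the output) and keep the category (as the input). Modify the procss_json function in data/qbposter/get_prompt.py as needed.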
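For readers following along, the masking idea described above might be sketched roughly as follows. This is not the repository's `procss_json`; the helper name `mask_elements`, the `MASKS` table, and the use of `None` as the mask value are assumptions made for illustration, and boxes are assumed to be `[left, top, width, height]`.

```python
# Hypothetical sketch, not the code in data/qbposter/get_prompt.py.
# Each task name maps to the fields that are masked (True = hidden from the
# model and left for it to predict).
MASKS = {
    "c->s+p": {"label": False, "position": True, "size": True},
    "c+s->p": {"label": False, "position": True, "size": False},
    "none->c+s+p": {"label": True, "position": True, "size": True},
}


def mask_elements(elements, task="c->s+p"):
    """Return a copy of `elements` with the fields for `task` set to None.

    Assumes each element looks like
    {"label": str, "box": [left, top, width, height]}.
    """
    rule = MASKS[task]
    masked = []
    for el in elements:
        left, top, width, height = el["box"]
        masked.append({
            "label": None if rule["label"] else el["label"],
            "box": [
                None if rule["position"] else left,
                None if rule["position"] else top,
                None if rule["size"] else width,
                None if rule["size"] else height,
            ],
        })
    return masked
```

With this sketch, `mask_elements(elements, "c+s->p")` keeps each label and size and blanks only the position, which would correspond to the c+s -> p setting asked about above.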

hitlxm commented 1 month ago

So is the prompt like this? "value": "\nHello! Could you please help me to place 4 foreground elements over the background image of resolution [768, 768] to craft an aesthetically pleasing, harmonious, balanced, and visually appealing commercial poster?\nFinding semantic-meaningful objects or visual foci on the background image at first might help in designing, and you should avoid any unnecessary blocking of them. logo needs to be placed at the top of the background image. \nFor each layout, there are 3 additional user requirements and you are expected to generate a layout corresponding to them. Here is the user requirements: The layout should ensure that there is clear space around both text elements. text elements should be aligned vertically with consistent horizontal margins.\nPlease return the result by completing the following JSON file. Each element's location and size should be represented by a bounding box described as [left, top, width, height], and each number is a continuous digit from 0 to 1.\nHere is the initial JSON file: [{'label': 'text', 'box': [None,None,100,30]}, {'label': 'text', 'box': [None,None,100,30]}, {'label': 'text', 'box': [None,None,100,30]}, {'label': 'logo', 'box': [None,None,30,30]}]\n"

hitlxm commented 1 month ago

The template uses [left, top, right, bottom]. After I changed it to [left, top, width, height], the output looks rather odd. Is the output still in [left, top, right, bottom] format?

[left, top, right, bottom] + c -> s+p mode:

[image: sg-11134249-7rdvc-lzpusqheht6v79_ori_vis_with_uc]

[left, top, width, height] + c+s -> p mode, prompt: "value": "\nHello! Could you please help me to place 4 foreground elements over the background image of resolution [768, 768] to craft an aesthetically pleasing, harmonious, balanced, and visually appealing commercial poster?\nFinding semantic-meaningful objects or visual foci on the background image at first might help in designing, and you should avoid any unnecessary blocking of them. logo needs to be placed at the top of the background image. \nFor each layout, there are 3 additional user requirements and you are expected to generate a layout corresponding to them. Here is the user requirements: The layout should ensure that there is clear space around both text elements. text elements should be aligned vertically with consistent horizontal margins.\nPlease return the result by completing the following JSON file. Each element's location and size should be represented by a bounding box described as [left, top, width, height], and each number is a continuous digit from 0 to 1.\nHere is the initial JSON file: [{'label': 'text', 'box': [None, None, 0.13020833333333334, 0.0390625]}, {'label': 'text', 'box': [None, None, 0.13020833333333334, 0.0390625]}, {'label': 'text', 'box': [None, None, 0.13020833333333334, 0.0390625]}, {'label': 'logo', 'box': [None, None, 0.0390625, 0.0390625]}]\n"

[image: SeaTalk_IMG_20241012_101524]
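One way to check the suspicion above is to convert the returned boxes between the two conventions and see which interpretation gives a sensible layout. The sketch below is only an illustration: the function names are made up here, and it has not been confirmed in this thread which format the model actually emits. It also shows the pixel-to-fraction normalization used for the sizes in the prompt (100 px and 30 px on a 768 px canvas).

```python
def ltwh_to_ltrb(box):
    """Convert [left, top, width, height] to [left, top, right, bottom]."""
    left, top, width, height = box
    return [left, top, left + width, top + height]


def ltrb_to_ltwh(box):
    """Convert [left, top, right, bottom] to [left, top, width, height]."""
    left, top, right, bottom = box
    return [left, top, right - left, bottom - top]


def normalize_size(width_px, height_px, canvas_w, canvas_h):
    """Scale a pixel size to fractions of the canvas, as the prompt expects."""
    return [width_px / canvas_w, height_px / canvas_h]


# Sizes from the prompt above: a 100x30 px text box on a 768x768 canvas.
text_size = normalize_size(100, 30, 768, 768)
```

If a predicted box only looks right after `ltrb_to_ltwh`, that would suggest the model is still answering in the template's original [left, top, right, bottom] convention.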