SignLLM / Prompt2Sign

The Data and Code of Prompt2Sign: The First Comprehensive Multilingual Sign Language Dataset.
https://signllm.github.io/Prompt2Sign/

When will the complete code be released? #1

Closed · caochengchen closed this issue 1 month ago

SignLLM commented 1 month ago

@caochengchen I'm glad you're interested in our work.

But to be honest, don't worry, because I'm a bit busy. If it weren't for a new sign language recognition work with a name similar to ours, I wouldn't have released this sign language production work on arXiv yet.

The original plan was to wait at least a few more months before releasing. When we are ready, I will @ you. Thank you again.

caochengchen commented 1 month ago

@caochengchen I'm glad you're interested in our work.

But to be honest, don't worry, because I'm a bit busy. If it weren't for a new sign language recognition job with a name similar to ours, I wouldn't have released this sign language production job to arXiv.

The original plan was to wait at least a few more months before being released. If we are ready, I will @ you. Thank you again.

Okay, looking forward to your reply. Thank you for your contribution.

SignLLM commented 1 month ago

@caochengchen I think what you're saying makes sense. Height and build can indeed affect performance somewhat. After normalization, we tried to minimize that impact by resetting a standard origin. However, some people have longer arms than others, which can still have a slight effect.

For example, two people may both be 1.7 meters tall, yet one has longer legs than the other. That effect is fairly small. Previous work used many different sign language demonstrators whose body proportions also differed, and I think our data, after normalization, has at least a slightly smaller error than the original video.

We can consider whether this issue needs continued attention in the future. Your perspective is quite novel: to some extent, the AI model should not be sensitive to small differences in the data, and such differences may even improve its robustness. Of course, this is just my guess.
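For anyone following along, the normalization described above (recentering on a standard origin and rescaling by a body-proportion reference) can be sketched roughly as follows. This is an illustrative sketch only: the OpenPose-style joint indices, the neck as origin, and shoulder width as the scale reference are assumptions, not the exact Prompt2Sign pipeline.

```python
import numpy as np

def normalize_pose(frames, origin_idx=1, left_sh=2, right_sh=5):
    """Recenter each frame on a reference joint and rescale by shoulder width.

    frames: (T, J, 2) array of 2D keypoints per frame.
    origin_idx / left_sh / right_sh: joint indices; an OpenPose-style
    layout (neck = 1, shoulders = 2 and 5) is assumed for illustration.
    """
    frames = np.asarray(frames, dtype=float)
    out = np.empty_like(frames)
    for t, frame in enumerate(frames):
        origin = frame[origin_idx]
        scale = np.linalg.norm(frame[left_sh] - frame[right_sh])
        if scale < 1e-6:          # guard against degenerate (missing) frames
            scale = 1.0
        out[t] = (frame - origin) / scale
    return out
```

If the data will later need to be denormalized, the per-frame origin and scale must be stored alongside the normalized coordinates.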

caochengchen commented 1 month ago

@caochengchen I think what you're saying makes sense. Being tall, short, fat, or thin can indeed affect some performance. After normalization, we tried to minimize the impact by resetting the standard origin. However, some people have longer arms, while others have shorter arms, which can have a slight impact.

For example, it's like there are people who are 1.7 meters tall, but some have longer legs than another person. This impact is a bit small. In previous work, there were many different sign language demonstrators, and in fact, their body proportions were also different. I think our data, after normalization, has at least a slightly smaller error compared to the original video.

We can consider whether it is necessary to continue paying attention to this issue in the future. Your perspective is very innovative. The AI model should not be sensitive to small differences in data to some extent, and may even enhance its robustness. Of course, this is just my guess.

Haha, I thought you hadn't seen the retracted message. Thank you for your reply. So far, the normalized data you provided looks great. By the way, when can you provide the code for denormalization? If we plan to use your normalized data for SLP, we will eventually need to denormalize it for pose reconstruction. Or do you have another solution in mind, such as reconstructing the pose directly from the normalized data?

caochengchen commented 1 month ago

@SignLLM I have another question for you. I noticed that the normalized data for Phoenix14T was shared by a user named chenchen in March 2022. I suspect he might be one of your authors. I have checked all the data from the March 2022 version of Phoenix14T and found that it is exactly the same as the data you shared this year, with no differences whatsoever. Could you please confirm whether you had already adopted your optimized normalization data processing method back in March 2022? Or is the version of the data you shared this year incorrect?

SignLLM commented 1 month ago

@caochengchen Yes, I obtained it from that preprocessed data. The SignDiff paper mentioned that it was obtained from an open-source link, and I didn't re-extract this German data myself later. Strictly speaking, that person's steps differ slightly from mine, as they seem to involve more processing. But I didn't notice any significant differences between it and the other 7, so I didn't deliberately remake it. Well observed.

Additionally, I believe the user who shared that data is not one of my co-authors, but rather a coincidental duplicate name. If I have time, I can check.

I lean toward believing that the author from 2022 had already applied normalization, since their processing steps should be the standard processing steps of that repository, which are similar to ours.

SignLLM commented 1 month ago

@SignLLM I have another question for you. I noticed that the normalized data for Phoenix14T was shared by a user named chenchen in March 2022. I suspect he might be one of your authors. I have checked all the data from the March 2022 version of Phoenix14T and found that it is exactly the same as the data you shared this year, with no differences whatsoever. Could you please confirm whether you had already adopted your optimized normalization data processing method back in March 2022? Or is the version of the data you shared this year incorrect?

@caochengchen I think we can separate out the code that generates the corresponding video for the model and create a small module specifically for outputting video from this standardized data. That way, different models could also use it to produce a normal video.

Strictly speaking, this was one of my initial wishes, but it was too engineering-heavy and didn't help with publishing papers, so I never did it.
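As a sketch of what such a module could start from, the snippet below rasterizes one standardized frame onto a grayscale canvas; any video writer could then encode the per-frame images. The assumption that coordinates lie roughly in [-1, 1] and the scaling constants are hypothetical, not the repository's actual conventions.

```python
import numpy as np

def rasterize_frame(frame, size=256):
    """Draw 2D keypoints (assumed roughly in [-1, 1]) onto a grayscale canvas.

    frame: (J, 2) array of normalized keypoints.
    Returns a (size, size) uint8 image with one white pixel per keypoint.
    """
    canvas = np.zeros((size, size), dtype=np.uint8)
    # map [-1, 1] into the central 80% of the canvas (illustrative choice)
    pts = ((np.asarray(frame, dtype=float) * 0.4 + 0.5) * (size - 1)).astype(int)
    pts = np.clip(pts, 0, size - 1)
    for x, y in pts:
        canvas[y, x] = 255
    return canvas
```

A real module would additionally draw limb connections between joints and stream the frames into a video encoder.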

caochengchen commented 1 month ago

@SignLLM Could you please share the code for denormalization in advance? Since normalized data is used in SLP, it needs to be reconstructed to the original posture for visualization.
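In the meantime, it is worth noting that if the per-frame origin and scale used during normalization were saved, denormalization is just the inverse transform. The function below is a hypothetical sketch under that assumption, not the authors' code:

```python
import numpy as np

def denormalize_pose(norm_frames, origins, scales):
    """Invert recenter-and-rescale normalization.

    norm_frames: (T, J, 2) normalized keypoints.
    origins: (T, 2) per-frame origin positions saved during normalization.
    scales: (T,) per-frame scale factors (e.g. shoulder width).
    """
    norm_frames = np.asarray(norm_frames, dtype=float)
    origins = np.asarray(origins, dtype=float)
    scales = np.asarray(scales, dtype=float)
    # broadcast: (T, J, 2) * (T, 1, 1) + (T, 1, 2)
    return norm_frames * scales[:, None, None] + origins[:, None, :]
```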

SignLLM commented 1 month ago

@SignLLM Could you please share the code for denormalization in advance?

You all mentioned that link, so I thought you had already run its code. That link is the code that directly generates a skeletal pose video from standardized data: https://github.com/BenSaunders27/ProgressiveTransformersSLP/blob/master/plot_videos.py

Note: when I was about to quote something earlier, I accidentally deleted a comment that shouldn't have been deleted. Sorry.

caochengchen commented 1 month ago

@SignLLM Could you please share the code for denormalization in advance?

You all mentioned that link, I thought you had run through its code. This link is the code that directly generates a bone pose video from standardized data: https://github.com/BenSaunders27/ProgressiveTransformersSLP/blob/master/plot_videos.py

Note: when I was about to quote earlier, I accidentally deleted a comment that shouldn't have been deleted. sorry

No problem, thank you very much for your help.

caochengchen commented 1 month ago

@SignLLM I just found out that you are the author of SignDiff. I read this article a long time ago, and you are really amazing. Could you share the complete code for SignDiff? I think this would be very helpful to me. Thank you. Here is my email address: 17855477582@163.com

SignLLM commented 1 month ago

@SignLLM I just found out that you are the author of SignDiff. I read this article a long time ago, and you are really amazing. Could you share the complete code for SignDiff? I think this would be very helpful to me. Thank you. Here is my email address: 17855477582@163.com

If I'm ready, I'll send it to you, but I feel I need to upgrade that paper first. I had only just entered the field of sign language at that time, and I now see many areas for optimization. I'll think about it later.

caochengchen commented 1 month ago

@SignLLM I just found out that you are the author of SignDiff. I read this article a long time ago, and you are really amazing. Could you share the complete code for SignDiff? I think this would be very helpful to me. Thank you. Here is my email address: 17855477582@163.com

If I'm ready, I'll send it to you, but I feel like I need to upgrade that article now. Because I had just entered the field of sign language at that time, I now feel that there are many areas for optimization. I will think about it later.

Okay, looking forward to your reply. Thank you for your contribution.

caochengchen commented 1 month ago

@SignLLM I used plot_videos.py to process the PHOENIX-2014T dataset you provided, i.e. the resource you mentioned was obtained from the internet. I directly used plot_videos.py to generate videos from the raw ground-truth data for the training and validation sets, but I found that the quality of the generated videos is very poor and they frequently drop keypoints.

SignLLM commented 1 month ago

@caochengchen Please select some generated videos and send them to this email: signllm@googlegroups.com

We'll take a look when we have time. I think there might be something wrong with the format of the sequence.

caochengchen commented 1 month ago

@caochengchen Please select some generated videos and send them to this email: signllm@googlegroups.com

We'll take a look when we have time. I think there might be something wrong with the format of the sequence.

Alright, it has been sent.

SignLLM commented 1 month ago

@caochengchen Please select some generated videos and send them to this email: signllm@googlegroups.com We'll take a look when we have time. I think there might be something wrong with the format of the sequence.

Alright, it has been sent.

@caochengchen Something feels off after looking at the contents; this looks like an abnormal result. I have seen the preprocessed 14T ground truth, and it is very different from yours, but I don't know which step went wrong, as there are too many possible causes. This is not normal ground truth. Maybe you can double-check it; you may find a small problem you hadn't noticed before. Some problems you only understand once you've run into them.
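When rendered videos drop keypoints like this, a quick sanity check on the sequence often localizes the problem. The sketch below is generic: treating zero-valued keypoints as "missing" and the 20% per-frame threshold are assumptions for illustration, not the dataset's documented convention.

```python
import numpy as np

def audit_sequence(frames, miss_value=0.0):
    """Report frames whose keypoints look missing or malformed.

    frames: (T, J, 2) keypoint array; a keypoint equal to `miss_value`
    in both coordinates is treated as missing (a common convention,
    assumed here).
    """
    frames = np.asarray(frames, dtype=float)
    missing = np.all(frames == miss_value, axis=-1)   # (T, J) mask
    per_frame = missing.mean(axis=1)                  # fraction missing per frame
    suspect = np.flatnonzero(per_frame > 0.2)         # frames over 20% missing
    return {
        "total_frames": len(frames),
        "missing_fraction": float(missing.mean()),
        "suspect_frames": suspect.tolist(),
    }
```

Running this over the training and validation splits and comparing the missing fractions against a known-good sample would quickly show whether the problem is in the data itself or in the plotting step.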