[PACE] Processing PhotoChat dataset

passing2961 commented 11 months ago

Hi, this is Young-Jun Lee.

I have one question about fine-tuning the proposed pre-trained model on the PhotoChat dataset for the intent prediction task.

Before fine-tuning the model, I ran the write_photochat_intent.py file after downloading the PhotoChat dataset from the official repository. I ran into one problem: when I print the paths variable in this line, it is just an empty list.

Can you elaborate on how to process the PhotoChat dataset?

Best regards,

pldlgb commented 11 months ago

Hello~ Thank you for your interest in our work. Have you already replaced the root path on line 115 with the local path?

passing2961 commented 11 months ago

Yes, I replaced the root path with the local path where the PhotoChat dataset is saved.

pldlgb commented 11 months ago

The file storage format should look something like this. You can save it in this format and then give it another try.

(base) ➜ photochat ls
jsonfile photochat_image_test.tar.gz test validation photochat-image.tar.gz photochatjson.zip train
(base) ➜ photochat cd test
(base) ➜ test ls
00167b2ac2394762.jpg 3c4bd7b35e4edf0e.jpg 7ba459eb6caa0c65.jpg bd69aac7e2ae893d.jpg
0094a447412998f6.jpg 3c51756b03c6ca93.jpg 7c1cb96ad05f3309.jpg bd9f55c0e9f4e5d4.jpg
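An empty paths list usually means no image files were found where the script looked. As a quick sanity check of the layout above, here is a minimal sketch that counts the .jpg files in each split folder. The function name count_images and the default root "photochat" are hypothetical, and this is not how write_photochat_intent.py itself collects paths.

```python
from pathlib import Path

# Split folder names taken from the directory listing above.
SPLITS = ("train", "validation", "test")


def count_images(root, splits=SPLITS, pattern="*.jpg"):
    """Return {split: number of files matching pattern directly inside root/split}."""
    root = Path(root)
    counts = {}
    for split in splits:
        split_dir = root / split
        # A missing folder is reported as 0 so the problem is visible.
        counts[split] = len(list(split_dir.glob(pattern))) if split_dir.is_dir() else 0
    return counts


if __name__ == "__main__":
    import sys

    # Hypothetical default; pass your own dataset root as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "photochat"
    for split, n in count_images(root).items():
        print(f"{split}: {n} images")
```

If any split reports 0, the images are probably still inside the archives or sit one directory level deeper than expected.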

passing2961 commented 11 months ago

Thanks for letting me know :)

When you downloaded the images, did you just read the "photo_url" field in the PhotoChat dataset JSON file, or did you download the image data from CVDF's site directly?

pldlgb commented 11 months ago

We just read the "photo_url" in the PhotoChat dataset JSON file～
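For readers who want to reproduce this step, here is a minimal sketch of downloading images from the "photo_url" field. It assumes each JSON file holds a list of dialogue objects with a "photo_url" key, and that the last component of the URL path is a usable file name. The function names photo_targets and download_split are hypothetical, and this is not the script the authors used.

```python
import json
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def photo_targets(dialogues, out_dir):
    """Return unique (url, destination path) pairs for dialogues with a "photo_url"."""
    out_dir = Path(out_dir)
    seen, targets = set(), []
    for dialogue in dialogues:
        url = dialogue.get("photo_url")
        if not url or url in seen:
            continue
        seen.add(url)
        # Assumption: the last URL path component works as the local file name.
        targets.append((url, out_dir / Path(urlparse(url).path).name))
    return targets


def download_split(json_path, out_dir):
    """Download every photo referenced in one JSON file, skipping existing files."""
    with open(json_path, encoding="utf-8") as f:
        dialogues = json.load(f)  # assumed: a list of dialogue objects
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for url, dest in photo_targets(dialogues, out_dir):
        if dest.exists():
            continue
        try:
            urllib.request.urlretrieve(url, str(dest))
        except OSError as err:  # URLError and HTTPError are OSError subclasses
            print(f"skipped {url}: {err}")
```

Some URLs may no longer resolve, so the sketch reports failures and continues instead of stopping at the first one.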

passing2961 commented 11 months ago

Hello,

Is there any reason to set 30 in here?

Best Regards,

passing2961 commented 11 months ago

Hello,

I found something that I didn't understand well. Well, I might be wrong. So, let's discuss it :)

When I print the last_turn and this_turn variables in this line, the results are:

last_turn -> ''
this_turn -> i am just doing more and more exercises i am good That good i have a nose for exercise as well heres a photo of the fam

The given dialogue is:

{'message': 'Hello how are you?', 'share_photo': False, 'user_id': 0}
{'message': 'hi am fine', 'share_photo': False, 'user_id': 1}
{'message': 'how about you', 'share_photo': False, 'user_id': 1}
{'message': 'Im good just enjoying a drink', 'share_photo': False, 'user_id': 0}
{'message': 'oh nice', 'share_photo': False, 'user_id': 1}
{'message': 'Have had my eye on a cocktail all night', 'share_photo': False, 'user_id': 0}
{'message': 'how about you?', 'share_photo': False, 'user_id': 0}
{'message': 'My head has been spiinning taking care of kids all day now that there is now school', 'share_photo': False, 'user_id': 0}
{'message': 'Hows your head feeling?', 'share_photo': False, 'user_id': 0}
{'message': 'i am just doing more and more exercises', 'share_photo': False, 'user_id': 1}
{'message': 'i am good', 'share_photo': False, 'user_id': 1}
{'message': 'That good i have a nose for exercise as well', 'share_photo': False, 'user_id': 0}
{'message': '', 'share_photo': True, 'user_id': 0}
{'message': 'heres a photo of the fam', 'share_photo': False, 'user_id': 0}

As I understand the PhotoChat paper, it defines the photo-sharing intent prediction task as predicting the intent conditioned on the previous dialogue history together with speaker information. But when I print the result, it seems to include a turn (i.e., "heres a photo of the fam") that appears after the image is shared.

So, I think we need to add the break statement in the code like this:

while idx < len(dialogue) and dialogue[idx]["user_id"] == 0:
    if dialogue[idx]["share_photo"] == True:
        share = True
        break
    if dialogue[idx]["message"] != '':
        user_zero.append(dialogue[idx]["message"])
    idx += 1

What do you think? Should I change the code as suggested above?

Best regards
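To make the effect of the proposed break concrete, here is a minimal standalone sketch that keeps only the messages preceding the first shared photo. It is a simplification and does not group turns by speaker the way the loop above does. The function name messages_before_photo is hypothetical, and the sample turns are taken from the dialogue above.

```python
def messages_before_photo(dialogue):
    """Return the non-empty messages that precede the first shared photo."""
    context = []
    for turn in dialogue:
        if turn["share_photo"]:
            break  # everything from here on happens after the photo is shared
        if turn["message"] != "":
            context.append(turn["message"])
    return context


dialogue = [
    {"message": "That good i have a nose for exercise as well", "share_photo": False, "user_id": 0},
    {"message": "", "share_photo": True, "user_id": 0},
    {"message": "heres a photo of the fam", "share_photo": False, "user_id": 0},
]

# Prints: ['That good i have a nose for exercise as well']
print(messages_before_photo(dialogue))
```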

pldlgb commented 11 months ago

> Is there any reason to set 30 in here?

If I'm not mistaken, the context in some conversations can be very long. We therefore truncate it here and keep only the last 30 sentences closest to the photo sharing, which makes the distribution of the whole dataset more even. You can also use the full context as the history.
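The truncation described here amounts to a slice from the end of the context list. A minimal sketch, assuming the context is a list of sentences ordered from oldest to newest. The names MAX_CONTEXT and truncate_context are hypothetical, not identifiers from the repository.

```python
MAX_CONTEXT = 30  # the value discussed in this thread


def truncate_context(sentences, max_len=MAX_CONTEXT):
    """Keep only the last max_len sentences (the ones closest to the photo).

    Pass max_len=None to keep the full history.
    """
    if max_len is None:
        return list(sentences)
    # Guard against max_len == 0, because sentences[-0:] would return everything.
    return list(sentences[-max_len:]) if max_len > 0 else []
```

Passing max_len=None corresponds to the option of using all of the context as history.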

pldlgb commented 11 months ago

Because the authors of PhotoChat did not open-source their code, we mainly followed the formula in the paper at that time: $\forall j \in [1, h],\ C(t_{1:j}, s_{1:j}) \in \{0, 1\}$. So I think adding "break" should also be feasible, but the impact of this change might not be significant.

    {
      "message": "That is great that you finally got to see him again. Sure!",
      "share_photo": false,
      "user_id": 1
    },
    {
      "message": "The photo is a bit dark. But the man in the front is Gianni.",
      "share_photo": false,
      "user_id": 0
    },
    {
      "message": "",
      "share_photo": true,
      "user_id": 0
    },
    {
      "message": "Even though the photo is dark, it is still a great photo!",
      "share_photo": false,
      "user_id": 1
    }

    {
      "message": "okay",
      "share_photo": false,
      "user_id": 1
    },
    {
      "message": "Here's a pic//",
      "share_photo": false,
      "user_id": 0
    },
    {
      "message": "",
      "share_photo": true,
      "user_id": 0
    },
    {
      "message": "hey interesting",
      "share_photo": false,
      "user_id": 1
    },
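One possible reading of the formula in this reply is that every history prefix gets a binary label. Here is a minimal sketch under that reading, assuming t stands for the turns and s for the speakers, and that the label is 1 when the next turn shares a photo. The function name prefix_examples is hypothetical, and this is neither the PhotoChat authors' code nor the PACE preprocessing.

```python
def prefix_examples(dialogue):
    """Return (history, label) pairs, one per non-empty history prefix.

    history is a list of (user_id, message) tuples; label is 1 if the turn
    that follows the prefix shares a photo, else 0.
    """
    examples = []
    history = []
    for turn in dialogue:
        if history:
            examples.append((list(history), int(turn["share_photo"])))
        if turn["share_photo"]:
            break  # turns after the photo are not part of any history
        history.append((turn["user_id"], turn["message"]))
    return examples


# Turns from the second excerpt above.
excerpt = [
    {"message": "okay", "share_photo": False, "user_id": 1},
    {"message": "Here's a pic//", "share_photo": False, "user_id": 0},
    {"message": "", "share_photo": True, "user_id": 0},
    {"message": "hey interesting", "share_photo": False, "user_id": 1},
]

for history, label in prefix_examples(excerpt):
    print(label, [message for _, message in history])
```

Under this reading, stopping at the photo keeps later turns such as "hey interesting" out of every history.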

passing2961 commented 11 months ago

Thanks for your kind response!

It is really helpful to me!

AlibabaResearch / DAMO-ConvAI

[PACE] Processing PhotoChat dataset #62