cici-ai-club / 3M

A codebase for multi-style image captions
Apache License 2.0
3 stars 1 forks source link

How to generate densecap json file? #3

Open shreyassks opened 1 year ago

shreyassks commented 1 year ago

Hi, I wanted to know how to generate this file "densecap/img_caption.json".

Also I am trying to write an inference script for the best-model.pth file provided. Should I load the state dict of the weights into densepembedAttModel model class provided in the scripts?

shreyassks commented 1 year ago

Can you also provide the config parameters used in the model? I can't find the Opt dictionary anywhere.

cici-ai-club commented 1 year ago

Hey Shreyas, thanks for reaching out. I will try to provide more instructions on this over the coming weekend.

shreyassks commented 1 year ago

Hi, Just leaving a comment as a reminder. Please try to provide instructions this weekend. Thanks

cici-ai-club commented 1 year ago

Hi Shreyas, I followed the repo below to generate densecap/img_caption.json: https://github.com/jcjohnson/densecap

shreyassks commented 1 year ago

Hi,

Thanks for the information. Could you please find a way to share the already generated file? I'll let you know once I have downloaded it, and you can remove it afterwards, since it's difficult to get free storage nowadays.

Thanks and regards, Shreyas

On Sun, 19 Mar 2023, 10:13, CAIC wrote:

Hi Shreyas, I followed the repo below to generate densecap/img_caption.json. The format should be {"image_name": image_features}, where image_features are extracted directly from the model in the repo below, using the PERSONALITY CAPTION dataset. https://github.com/jcjohnson/densecap

If you need my densecap/img_caption.json for the PERSONALITY CAPTION dataset, I have some already generated, but they cannot be shared directly through GitHub because of their size. I would have to see whether another place offers free space for the upload. Let me know if you need it; I can upload it if I find some free storage. The original locations I listed in the README have already expired.
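A quick way to sanity-check a downloaded or regenerated file against the {"image_name": image_features} layout described above is sketched below. It is only an illustration: the function name, the sample keys, and the sample values are made up, and the real value type depends on how the densecap output was exported.

```python
import json
import os
import tempfile


def summarize_densecap_json(path):
    """Load a {"image_name": image_features} mapping and report basic facts."""
    with open(path, "r", encoding="utf-8") as f:
        mapping = json.load(f)
    if not isinstance(mapping, dict):
        raise ValueError("expected a JSON object keyed by image name")
    # Record the Python type of each value so mixed entries are easy to spot.
    value_types = sorted({type(v).__name__ for v in mapping.values()})
    return {"num_images": len(mapping), "value_types": value_types}


# Tiny made-up file standing in for densecap/img_caption.json.
sample = {"img_0001": [1, 2, 3], "img_0002": [4, 5]}
sample_path = os.path.join(tempfile.mkdtemp(), "img_caption.json")
with open(sample_path, "w", encoding="utf-8") as f:
    json.dump(sample, f)

summary = summarize_densecap_json(sample_path)
print(summary)
```

Running the same function on the real file shows how many images it covers before any training or evaluation is started.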

cici-ai-club commented 1 year ago

Sure, let me try to find and upload them. Then I will let you know.

shreyassks commented 1 year ago

Also I am trying to write an inference script for the best-model.pth file provided. Should I load the state dict of the weights into densepembedAttModel model class provided in the scripts?

Thanks and regards Shreyas

chengxili commented 1 year ago

Hi Shreyas, I have uploaded densecap/img_caption.json to the drive below. Make sure you create a folder called densecap and put the file inside it. The trained model is also in the drive, specifically inside log_added_new. Download them and follow my instructions in the README, and you should be able to execute all the steps. https://drive.google.com/drive/u/4/folders/170palQ7QzRsY2ZRyaDTIQAcdHyuVZsFe

shreyassks commented 1 year ago

Thank you! I'll let you know once I download the file

shreyassks commented 1 year ago

Hi Cheng,

Thanks for all the help. I was able to create an inference script using the best-model.pth artifact provided. However, the generated captions don't make much sense for each personality trait. I have included one sample below. I'm not sure if I'm missing something. Please help me out.

Trait: Intense
Generated Caption: this shocked the critter beautiful knees i ' ve towels seen in my famous growing !
Ground Truth Caption: the snow will last as long as my sadness

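[image: Screenshot 2023-03-29 at 12 53 42 PM] https://user-images.githubusercontent.com/44189581/228457714-7106c1f2-a63b-4bc3-9aa3-2fd4adada43d.png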

Inference Code

```python
import numpy as np
import torch
from PIL import Image

# ds, data, model, id2word, device and index are defined earlier in the script.
ofc_feats = ds[index][0].to(device)
oatt_feats = ds[index][1].unsqueeze(0).to(device)
dcap = ds[index][2]
dcap = torch.from_numpy(np.int32(dcap)).to(device).unsqueeze(0)
attention_mask = torch.ones(1, 8)
traits = torch.from_numpy(np.int32(ds[index][4])).to(device).unsqueeze(0)

# Beam-search decode, then move the predicted token ids to a Python list.
pred, _ = model._sample_beam(ofc_feats, oatt_feats, dcap, attention_mask, personality=traits)
pred = pred.detach().cpu().numpy().tolist()
g_cap = " ".join(data[index]["sentence"][0])

# Recover the personality name from the one-hot trait vector.
trait_vector = traits.detach().cpu().numpy()
personality_index = np.where(trait_vector == 1)[1][0]
personality = ds.info["pix_to_personality"][str(personality_index)]
g_personality = data[index]["personality"]

# Map token ids back to words, skipping id 0.
cap = ""
for i in pred[0]:
    if i != 0:
        cap = cap + " " + id2word["ix_to_word"][str(i)]
print(f"Generated Caption for Personality: {personality} is :- {cap}\n")
print(f"Ground Truth Caption for Personality: {g_personality} is:- {g_cap}")

image_name = data[index]["id"] + ".jpg"
Image.open(f"../ParlAI/images/train_images/{image_name}")
```

cici-ai-club commented 1 year ago

Hey, I am not sure whether you are using the same personality index as mine. Please first make sure everything there is correct. Try some simple traits first, like happy or sad, and try the test data from my dataset to verify things. Thank you. Chengxi

shreyassks commented 1 year ago

I am using the same personality index as yours. I even tried the test set, but no luck. The generated captions still don't make sense.

cici-ai-club commented 1 year ago

The test set should make sense. I am not sure what is going wrong. You can first try a happy personality with a dog picture to see whether your setup works. Also check whether the caption you get makes sense.

Nischal-Newar commented 1 year ago

If possible could you please upload the extracted features from ResNext?

shreyassks commented 1 year ago

The extracted features are about 80 GB, which is too large to upload. You can easily extract them yourself by following the link below; just replace the model tag. https://huggingface.co/docs/timm/main/en/feature_extraction

Nischal-Newar commented 1 year ago

How can I use Hugging Face's feature_extraction instead of ParlAI's? I am currently using ParlAI, but not all images are being downloaded. As a result, the extraction process stops at a specific file.

Do you have the complete PERSONALITY-CAPTIONS dataset? Are you using FlickrStyle10K instead of YFCC, or both?

shreyassks commented 1 year ago

I'm using only YFCC. You can find the json files of train, val and test sets from ParlAI. There are Image URLs in those files. Use them to download the complete dataset.

cici-ai-club commented 1 year ago

Hey, yes, ParlAI won't work unless you add some code to skip those images. All the missing ones I found start with ac8, so add code that stops ParlAI from generating features when the hash matches ac8*.

On Mon, Apr 10, 2023, 7:21 AM Nischal Newar wrote:

I believe the train, val, and test sets only contain information about the image hash. I can avoid the 50 files that are missing and extract the features using the images I have. However, ParlAI won't work unless I have the remaining files. Can you guide me with the commands you used for Hugging Face to extract mean pool features and spatial features?

cici-ai-club commented 1 year ago

I should be able to help with this one. Have you figured it out? Also, have you tried the preprocessing steps I provided in the bash script? Let me know if you still need the file. I will be able to upload it in about a week, when I get back home.

Chengxi

On Sun, Apr 2, 2023, 2:38 AM Shreyas SK wrote:

I have tried to evaluate using the test split, but it seems this file is missing: "data/person-test-words.p". Can you please upload it to the drive?

File "/home/ec2-user/SageMaker/3M/denseeval3m.py", line 82, in <module>
    loss, split_predictions, lang_stats = eval_utils.eval_split(model, crit, loader, vars(opt))
File "/home/ec2-user/SageMaker/3M/deneval_utils3m.py", line 111, in eval_split
    init_scorer('person-'+split+'-words')
File "/home/ec2-user/SageMaker/3M/misc/rewards.py", line 25, in init_scorer
    CiderD_scorer = CiderD_scorer or CiderD(df=cached_tokens)
File "/home/ec2-user/SageMaker/3M/cider/pyciderevalcap/ciderD/ciderD.py", line 28, in __init__
    self.cider_scorer = CiderScorer(n=self._n, df_mode=self._df)
File "/home/ec2-user/SageMaker/3M/cider/pyciderevalcap/ciderD/ciderD_scorer.py", line 80, in __init__
    pkl_file = cPickle.load(open(os.path.join('data', df_mode + '.p'),'rb'), **(dict(encoding='latin1') if six.PY3 else {}))
FileNotFoundError: [Errno 2] No such file or directory: 'data/person-test-words.p'
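The traceback above shows the scorer opening data/person-<split>-words.p, so a small pre-flight check can report which cached files are missing before a long evaluation starts. This is a sketch only: the helper names are hypothetical, and the path pattern is taken from the traceback rather than from the repo's documentation.

```python
import os
import tempfile


def cider_cache_path(split, data_dir="data"):
    """Build the path the scorer opens: <data_dir>/person-<split>-words.p."""
    return os.path.join(data_dir, "person-" + split + "-words.p")


def missing_cider_caches(splits, data_dir="data"):
    """Return the cache paths that do not exist yet."""
    paths = [cider_cache_path(s, data_dir) for s in splits]
    return [p for p in paths if not os.path.isfile(p)]


# Demo in a throwaway directory where only the "val" cache exists.
demo_dir = tempfile.mkdtemp()
open(cider_cache_path("val", demo_dir), "wb").close()
missing = missing_cider_caches(["val", "test"], demo_dir)
print(missing)
```

If the list is not empty, the missing files have to be produced by the repo's preprocessing step or obtained from the author before evaluation can run.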

cici-ai-club commented 1 year ago

Hey Shreyas, I think that if you used a different feature set, you need to retrain the model for generation by following the training steps. Otherwise, the trained model would be confused by the input.

Chengxi

shreyassks commented 1 year ago

Hi, I am able to generate captions now. The vocabulary is indexed from 1, so I had to subtract 1 from each index during inference to get meaningful captions.

With the command you provided I am able to train, but it does not seem to switch to SCST after a few epochs, and I'm not sure whether I need to tune any hyperparameters to get the best model. Could you share the hyperparameters that were used to obtain the best model?
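The off-by-one fix described above can be sketched as a small decoding helper. This is an assumption-laden illustration, not the repo's code: the function name, the offset parameter, the "<unk>" fallback, and the toy vocabulary are all made up, and the right offset depends on how the vocabulary file was built.

```python
def decode_caption(token_ids, ix_to_word, offset=1, pad_id=0):
    """Turn predicted token ids into a caption string.

    offset is subtracted from every id before the vocabulary lookup;
    ids equal to pad_id are skipped, as in the posted inference code.
    """
    words = []
    for token_id in token_ids:
        if token_id == pad_id:
            continue
        key = str(token_id - offset)
        # Fall back to a marker instead of raising on an unknown id.
        words.append(ix_to_word.get(key, "<unk>"))
    return " ".join(words)


# Toy vocabulary, purely illustrative.
ix_to_word = {"1": "the", "2": "snow", "3": "falls"}
caption = decode_caption([2, 3, 4, 0, 0], ix_to_word)
print(caption)
```

Decoding the same ids with offset=0 and comparing the two outputs against a ground-truth caption is a cheap way to tell which convention a checkpoint expects.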

Nischal-Newar commented 1 year ago

I am using the feature extraction script provided by ParlAI, but I have not configured the development environment for ParlAI, so I am unable to modify the script to exclude the 'ac8' files. However, I am removing the records containing 'ac8' from the train, test, and validation files. I hope this will resolve the issue.
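Removing the affected records, as described above, can be sketched like this. The field name "image_hash" and the helper name are assumptions for illustration; check the key actually used in your train, val, and test JSON files before applying it.

```python
def drop_missing_images(records, missing_prefix="ac8", hash_key="image_hash"):
    """Return (kept_records, dropped_count), dropping hashes that start with the prefix."""
    kept = [r for r in records if not str(r[hash_key]).startswith(missing_prefix)]
    return kept, len(records) - len(kept)


# Made-up records standing in for entries of the dataset JSON files.
records = [
    {"image_hash": "ac8abc", "personality": "Happy"},
    {"image_hash": "1f2ac8", "personality": "Sad"},
    {"image_hash": "b07e11", "personality": "Intense"},
]
kept, dropped = drop_missing_images(records)
print(dropped, [r["image_hash"] for r in kept])
```

Only hashes that begin with the prefix are dropped; a hash that merely contains "ac8" somewhere else is kept.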

shreyassks commented 1 year ago

Hey Chengxi,

I have evaluated the best model on the test split. The results are below: {'Bleu_1': 0.42620738153029025, 'Bleu_2': 0.20458454245542831, 'Bleu_3': 0.11215662186393982, 'Bleu_4': 0.0657656883734098, 'METEOR': 0.11409083638519023, 'ROUGE_L': 0.2693195972647437, 'CIDEr': 0.09645846960478273, 'SPICE': 0.03508444086667843, 'bad_count_rate': 0.0006011421701232341}

The metrics above do not match the ones reported in the paper. Could you let me know whether I need to tune any parameter to achieve the same results as in the paper?

cici-ai-club commented 1 year ago

Hey Shreyas, the hyperparameters should be in the README. Did you follow the same training process I suggested?

cici-ai-club commented 1 year ago

A very small batch size will give training results very different from the paper's. My training needs about 30 GiB of GPU memory. My suggestions are to train in the cloud or to turn off the GUI.

On Thu, Apr 13, 2023, 6:58 AM Nischal Newar wrote:

Hey,

I need help with the memory issue. The current batch size I am using is 2, and going to 1 is creating an issue. Do you have any suggestions?

"torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 8.00 GiB total capacity; 7.20 GiB already allocated; 0 bytes free; 7.23 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF"
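The error message above points at max_split_size_mb, which is set through the PYTORCH_CUDA_ALLOC_CONF environment variable. A minimal sketch follows; the helper name and the value 128 are arbitrary illustrations, not recommendations from the thread. It only addresses fragmentation and will not make a job that needs roughly 30 GiB fit on an 8 GiB card.

```python
import os


def set_max_split_size(megabytes):
    """Set PYTORCH_CUDA_ALLOC_CONF so the CUDA caching allocator limits block splitting."""
    value = "max_split_size_mb:" + str(int(megabytes))
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


# Call this before PyTorch makes its first CUDA allocation,
# for example at the very top of the training script.
setting = set_max_split_size(128)
print(setting)
```

The same effect can be had by exporting the variable in the shell before launching training.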

mpremashish commented 1 year ago

Hey Chengxi,

Do you have the img_caption.json corresponding to the flickr10K data (the result of dense_cap)?

Thank you

shreyassks commented 1 year ago

Hi Chengxi,

I used the best model weights provided in the repo and evaluated them with the command provided in the README.

cici-ai-club commented 1 year ago

Are your image ResNeXt features the same as mine? I am not sure where your setup differs from mine. Chengxi

cici-ai-club commented 1 year ago

I will have to check in a week, after my current vacation. I should have it from before, but I am not sure whether I have lost access. You can always use the densecap GitHub repo to generate it.

shreyassks commented 1 year ago

Yes. I could not download the features from ParlAI, but I used the script below from ParlAI to generate them. https://github.com/facebookresearch/ParlAI/blob/main/parlai/core/image_featurizers.py

Mean pooled features - 'resnext101_32x48d_wsl': ['resnext101_32x48d_wsl', -1]
Spatial features - 'resnext101_32x48d_wsl_spatial': ['resnext101_32x48d_wsl', -2]

Let me know if these are the same as yours
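As a rough illustration of how the two feature types above relate, the sketch below averages a spatial map over its positions to obtain one value per channel. It assumes that the -1 entry corresponds to the network with its final classifier removed and the -2 entry to the network with the pooling layer removed as well, which is how torchvision-style ResNeXt models are laid out; the function name and toy numbers are made up.

```python
def mean_pool(spatial):
    """Average a C x H x W nested list over its H x W positions."""
    pooled = []
    for channel in spatial:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Toy spatial features: 2 channels, each a 2 x 2 grid.
spatial = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]
pooled = mean_pool(spatial)
print(pooled)
```

Under that assumption, pooling your own spatial features and comparing the result with your mean-pooled features is a quick consistency check on an extraction run.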

cici-ai-club commented 1 year ago

Hey Shreyas, I think I know the potential issue. The features I got from ParlAI date from 2019. In 2022 I used the same API and generated features under the same hashes, but the features were slightly different. By slightly different I mean that most of the features agree in most dimensions, but some cells do not. I emailed the author, and he does not know why this could happen, so I kept using the older features. My guess is that changes in PyTorch altered the floating-point precision. This could make the neural network produce different sentences, because the training features are all slightly different too. One thing I can do is try to find the test features I got in 2019, so that you can test on them.

Or, if you want, you can first examine whether the sentences from your test set make sense. That would tell you whether there is an input issue or whether it is the issue I described above.

Another thing you can do is retrain 3M on the new features and see what scores you get.
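To quantify "slightly different" for one image, two feature vectors extracted under the same hash can be compared cell by cell, as in the sketch below. The function name, tolerances, and sample numbers are illustrative assumptions only.

```python
import math


def compare_features(old, new, rel_tol=1e-5, abs_tol=1e-8):
    """Count the cells where two equally long feature vectors disagree."""
    if len(old) != len(new):
        raise ValueError("feature vectors must have the same length")
    differing = [
        i
        for i, (a, b) in enumerate(zip(old, new))
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
    return {
        "num_cells": len(old),
        "num_differing": len(differing),
        "first_differing": differing[:5],
    }


# Made-up vectors: identical except for the last cell.
old = [0.10, 0.25, 0.00, 1.50]
new = [0.10, 0.25, 0.00, 1.53]
report = compare_features(old, new)
print(report)
```

A large share of differing cells would point to a different extraction setup, while a small share is closer to the drift described above.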

shreyassks commented 1 year ago

Thanks a lot for pointing out the issue. Please share the test features you have with me.

I notice you used supervised training. Although I can find the SCST loss function, I just wanted your confirmation: did you also use SCST after a few epochs of cross-entropy training? If so, how many epochs of SCST did you run?