thunlp / VisualDS

MIT License
25 stars 3 forks

About the baseline model of your paper? #9

Closed ZHUXUHAN closed 1 year ago

ZHUXUHAN commented 1 year ago
  1. How can I train the model in the first line of Table 2? Can you provide the scripts?
  2. For the limited-labels method in Table 1, did you train the model yourself? If so, can you share how you trained it, since I cannot find open-source code for that method?
waxnkw commented 1 year ago
  1. The first line of Table 2 is cmds/20/motif/task/ds/em_M_step1.sh, which is also the result of line 3 in Table 2.
  2. The code is here: https://github.com/vincentschen/limited-label-scene-graphs.
ZHUXUHAN commented 1 year ago
> 1. The first line of Table 2 is cmds/20/motif/task/ds/em_M_step1.sh, which is also the result of line 3 in Table 2.
> 2. The code is here: https://github.com/vincentschen/limited-label-scene-graphs.

Yes, for question 1, I understand it now and have trained the model, but the result is about 2 points lower than your reported results, which is confusing.

For question 2, I followed that project to try to generate the data, but the quality of the generated labels looks poor.

waxnkw commented 1 year ago

Q1: I reorganized the code after the paper was accepted. I did not try to make sure that every baseline exactly reproduces the results in the paper, but I think the overall conclusions are the same.

There are also other factors that can affect the performance. For example, the GCC version will sometimes affect the final performance, as I have shown here.

The mR@K metric is not stable. Some predicate classes have only a few samples, so even a small change in the number of correctly predicted samples on the tail classes leads to relatively large fluctuations.
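
To make the instability concrete, here is a minimal sketch (with made-up class sizes, not numbers from the paper) of why equal-weight averaging over per-class recalls makes mR@K sensitive to the tail:

```python
def mean_recall(correct, total):
    """Equal-weight average of per-class recalls (the mR@K idea, K fixed)."""
    return sum(c / t for c, t in zip(correct, total)) / len(total)

# Hypothetical counts: one head class with 10,000 samples,
# one tail class with only 5 samples.
totals = [10_000, 5]

before = mean_recall([6_000, 1], totals)  # tail recall 1/5
after = mean_recall([6_000, 2], totals)   # one more correct tail sample: 2/5

print(f"before: {before:.3f}, after: {after:.3f}")
# before: 0.400, after: 0.500
```

A single changed prediction on the 5-sample class moves the mean by (1/5) / 2 = 0.1, i.e. 10 points, while the same change on the head class would be negligible.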

Q2: Yes, I think the quality of the generated labels depends heavily on the random seed and is not good all the time.

ZHUXUHAN commented 1 year ago

> Q1: I reorganized the code after the paper was accepted. I did not try to make sure that every baseline exactly reproduces the results in the paper, but I think the overall conclusions are the same.
>
> There are also other factors that can affect the performance. For example, the GCC version will sometimes affect the final performance, as I have shown here.
>
> The mR@K metric is not stable. Some predicate classes have only a few samples, so even a small change in the number of correctly predicted samples on the tail classes leads to relatively large fluctuations.
>
> Q2: Yes, I think the quality of the generated labels depends heavily on the random seed and is not good all the time.

Thank you very much for your detailed answers; this is very helpful.