shirley-wu / text_to_table

MIT License
How to understand some config #13

Open jiangpw41 opened 5 months ago

jiangpw41 commented 5 months ago

Hello, I am trying to reproduce your paper and have a few questions.

(1) Is "HAD model" an abbreviation for "High Availability Design model", and is it the same as "our method" in the paper?

(2) Is the script test_vanilla.sh used to test both the vanilla model and "our method"/HAD?

(3) Is the script test_constraint.sh used for the ablation experiments on the "table constraint" technique, and how can the "table relationship embeddings" be removed?

(4) Why is the --task parameter "translation" in the Rotowire training script for the vanilla model and "text_to_table_task" in the one for the HAD model, while both test_constraint.sh and test_vanilla.sh use "text_to_table_task"?

(5) How did you determine the setting TOTAL_NUM_UPDATES="8000" for training?

Your help would be deeply appreciated.

jiangpw41 commented 5 months ago

I may have obtained the answer to (2) above, as the test results are similar to the paper.

shirley-wu commented 4 months ago

Hello, sorry for the late response!

(1) Sorry for the confusing notation. HAD refers to the model with table relation embeddings. In our paper, "our method" refers to the combination of table relation embeddings and table constraints. We'll update the documentation to make that clear.

(2) test_vanilla.sh runs evaluation without table constraints. It can be used to evaluate both the baseline model and the model with table relation embeddings.

(3) test_constraint.sh runs evaluation with table constraints. So evaluating a model that has table relation embeddings with test_constraint.sh corresponds to our full method, not an ablation.
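
For readers unsure what a decoding-time table constraint might look like, here is a minimal sketch. It is not the repository's implementation. It assumes that the constraint forces every row to have as many cells as the header row, and that tables are serialized with "|" between cells and "<NEWLINE>" between rows. The function names are hypothetical.

```python
# Illustrative only: assumes "|" separates cells and "<NEWLINE>" separates rows.
SEP = "|"
NEWLINE = "<NEWLINE>"


def header_width(tokens):
    """Return the number of cells in the first row, or None if it is unfinished."""
    if NEWLINE not in tokens:
        return None
    first_row = tokens[: tokens.index(NEWLINE)]
    # A row "| a | b |" has one more separator than it has cells.
    return first_row.count(SEP) - 1


def current_row(tokens):
    """Return the tokens generated since the last row break."""
    if NEWLINE not in tokens:
        return tokens
    last_break = len(tokens) - 1 - tokens[::-1].index(NEWLINE)
    return tokens[last_break + 1:]


def newline_allowed(tokens):
    """A row break is allowed only once the row is as wide as the header."""
    width = header_width(tokens)
    if width is None:
        return True  # the header row itself is unconstrained
    return current_row(tokens).count(SEP) - 1 == width


def separator_allowed(tokens):
    """Another cell separator is allowed only while the row is narrower than the header."""
    width = header_width(tokens)
    if width is None:
        return True
    return current_row(tokens).count(SEP) - 1 < width
```

In a real decoder, checks like these would be used to mask out the disallowed token before sampling or beam expansion; evaluation without constraints would simply skip the masking.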

(4) The --task parameter is "text_to_table_task" whenever we need the custom code in src/tasks/text_to_table_task.py, which mainly parses the structure of the output table.
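
To illustrate how a --task name can select custom code, here is a dependency-free sketch of a name-to-class task registry. It assumes the repository builds on fairseq, which resolves --task through a registry filled by a `register_task` decorator. The classes, the `postprocess` method, and the table serialization ("|" between cells, "<NEWLINE>" between rows) are hypothetical stand-ins, not the actual code in src/tasks/text_to_table_task.py.

```python
# Illustrative registry; mimics the decorator pattern, not fairseq itself.
TASK_REGISTRY = {}


def register_task(name):
    """Register a task class under the name passed to --task."""
    def decorator(cls):
        if name in TASK_REGISTRY:
            raise ValueError(f"task {name!r} is already registered")
        TASK_REGISTRY[name] = cls
        return cls
    return decorator


@register_task("translation")
class TranslationTask:
    """Plain sequence-to-sequence task: output is kept as a flat string."""

    def postprocess(self, output):
        return output


@register_task("text_to_table_task")
class TextToTableTask(TranslationTask):
    """Custom task: additionally parses the output string into rows and cells."""

    def postprocess(self, output):
        rows = [row.strip() for row in output.split("<NEWLINE>")]
        return [
            [cell.strip() for cell in row.strip("|").split("|")]
            for row in rows
            if row
        ]


def setup_task(name):
    """Look up and instantiate the task selected on the command line."""
    return TASK_REGISTRY[name]()
```

With this pattern, a script that only needs plain sequence-to-sequence behavior can pass "translation", while any script that relies on table parsing has to pass the custom task name.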

(5) 8000 is a hyperparameter; we tuned it to find a good setting.

Hope it's still helpful!