fzyzcjy opened this issue 1 year ago:

@albertqjiang Hi, thanks for the paper with code! I wonder what the correct command is to trigger training that reproduces the accuracy numbers in the figures (e.g. Figure 2, left)?

I have tried the command in the training section of the README, but the results look quite different from those reported in the paper. For example, training accuracy soon becomes very high (~94%), as can be seen from the (collapsed) training log, whereas Figure 2 (left) shows roughly 75%. I will play with the code a little more, but wanted to create this issue first as quick feedback that the README (or the code?) may not be fully consistent with the paper :/
Is this k=3 l=7?
Here is the command:
CUDA_VISIBLE_DEVICES=2 python -m int_environment.algos.main --combo_path data/benchmark/field --dump /data/jychen/misc/int_code_results/pt_models/ --online --train_sets k\=5_l\=5 --test_sets k\=5_l\=5 --epochs_per_online_dataset 10 --num_probs 1000 --lr 1e-4 --updates 1000000 --transform_gt --degree 0 --seed 0 --epoch_per_case_record 200
Thus I guess it is k=5, l=5, judging from the `--train_sets k\=5_l\=5` argument (see the small parsing sketch below).
Basically it is the command from README.md, but with long option names instead of the abbreviated ones in the command-line arguments.
Also, I am using https://github.com/fzyzcjy/INT (code diff: https://github.com/albertqjiang/INT/pull/15/files), but IMHO there are no semantic changes except for some logging, etc.
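For reference, the set name itself encodes the configuration. A throwaway snippet to decode it (purely illustrative; this helper is not part of the INT codebase) could look like:

```python
# Illustrative only: decode a set name such as "k=5_l=5" into its
# (k, l) components. This helper is hypothetical, not part of INT.
def parse_set_name(name: str) -> dict:
    return {key: int(value)
            for key, value in (part.split("=") for part in name.split("_"))}

print(parse_set_name("k=5_l=5"))  # -> {'k': 5, 'l': 5}
```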
Ah, I think there could be two issues:
(1) In the paper I used a larger number of axiom combinations and orders than shown in the example (otherwise the example would be too slow).
(2) Figure 2 covers both equalities and inequalities, so one needs to generate ordered_field axiom combos and orders. I.e., when generating the axiom combinations and orders, try
python -m int_environment.data_generation.combos_and_orders --combo_path data/benchmark/ordered_field --max_k 5 --max_l 5 --trial 1000000
This should take ~20 minutes if memory serves.
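(Side note: after generation finishes, a quick way to confirm that something was actually written is to list the output files. This minimal sketch assumes nothing about the file format, only the `--combo_path` directory from the command above:)

```python
# Minimal sanity check: walk the --combo_path directory and list
# whatever files the generation step produced, with their sizes.
import os

combo_path = "data/benchmark/ordered_field"  # same path as in the command above
for root, _dirs, files in os.walk(combo_path):
    for name in sorted(files):
        path = os.path.join(root, name)
        print(f"{path}\t{os.path.getsize(path)} bytes")
```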
Thank you for the information!
So I will try the following commands and report the results once they finish :)
python -m int_environment.data_generation.combos_and_orders --combo_path /data/jychen/misc/int_code_results/data_benchmark_k5_l5/ordered_field --max_k 5 --max_l 5 --trial 1000000
CUDA_VISIBLE_DEVICES=2 python -m int_environment.algos.main --combo_path /data/jychen/misc/int_code_results/data_benchmark_k5_l5/ordered_field --dump /data/jychen/misc/int_code_results/pt_models/ --online --train_sets k\=5_l\=5 --test_sets k\=5_l\=5 --epochs_per_online_dataset 10 --num_probs 1000 --lr 1e-4 --updates 1000000 --transform_gt --degree 0 --seed 0 --epoch_per_case_record 200 | tee /data/jychen/misc/int_code_results/logs/$(date +%s).log
In addition, I would appreciate knowing how to reproduce the "Transformer" results. IMHO the README does not mention any command to train a transformer or seq2seq model (it only covers data generation).
@fzyzcjy Hi, do you know how to reproduce the transformer result now? I'm also interested in it.
@venom12138 Hi, I did not reproduce it either, and later tried other papers.