Closed rucjrliu closed 3 years ago
Can you show us which command you used to generate the workload? Are these two methods using the same workload? This assertion is there to make sure the number and values of all input queries and ground truths are the same.
Workload Generation Command: just wkld-gen-base census13 original name='base'
The 2 methods use the same workload.
PS: I find that if I set the use_cache to False, it is OK. Can I ask why?
Thanks.
We cache the encoded features and labels for both lw-nn and lw-tree for convenience (e.g., we do not need to query the database every time to get CE features). During training, we load this cache directly if it exists; otherwise we create one. For testing, if use_cache is True, we do the same, and we also add this assertion line to make sure the cached workload aligns with the one being tested. If you set use_cache to False, we won't load the cache; the encoding for each test query is generated on the fly, so there is no check. The cache creation and loading logic can be found in lecarb/estimator/lw/common.py: load_lw_dataset.
In your case there is a mismatch between your cached workload and the workload you are testing. Can you try to delete the cache file (located in the lw
directory for each dataset) and run train and test again?
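The load-or-create caching behavior described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function signature, the `encode_query` callback, and the cache file layout are all hypothetical; only the overall logic (load cache if present, assert it matches the testing workload, skip the check entirely when `use_cache` is False) mirrors the explanation.

```python
import os
import pickle

def load_lw_dataset(workload, cache_path, encode_query, use_cache=True):
    """Load encoded queries from cache, or encode and cache them.

    Hypothetical sketch: `encode_query` and the pickle layout are
    illustrative stand-ins for the real feature-encoding pipeline.
    """
    if use_cache and os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            cached = pickle.load(f)
        # This is the assertion that fires when the cached workload
        # does not match the workload being tested.
        assert cached["queries"] == workload, \
            "cached workload does not align with the testing workload"
        return cached["encoded"]

    # No usable cache (or use_cache=False): encode on the fly.
    encoded = [encode_query(q) for q in workload]
    if use_cache:
        with open(cache_path, "wb") as f:
            pickle.dump({"queries": workload, "encoded": encoded}, f)
    return encoded
```

This makes the reported behavior easy to see: with `use_cache=True`, a stale cache file trips the assertion, while `use_cache=False` bypasses both the cache and the check, so deleting the stale cache file (or disabling the cache) resolves the error.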
I followed your advice and retried. It works now! Thank you for your patience and the clear explanation!
I ran the program following hyper-params.md:
lw-tree census13
just train-lw-tree census13 original base 64 200 100000 0 123
just test-lw-tree original_base-lwxgb_tr64_bin200_100k-123 census13 original base True 123
lw-nn census13
just train-lw-nn census13 original base 64_64_64 200 100000 128 0 123
just test-lw-nn original_base-lwnn_hid64_64_64_bin200_ep500_bs128_100k-123 census13 original base True 123
Training completes fine, but during testing an AssertionError is raised and the program exits.
Looking forward to your reply. Thanks a lot!