megagonlabs / sato

Code and data for Sato https://arxiv.org/abs/1911.06311.
Apache License 2.0

Do you have the raw data before converting to 1587 dimensional Sherlock features? #20

Open sux1ngyu opened 3 years ago

sux1ngyu commented 3 years ago

I'm trying to find the original tables for your project. However, I can only find the converted 1587-dimensional features for each column, not the original strings. Do you have the original strings, and if so, where can I find them?

suhara commented 3 years ago

@susuxy We have uploaded the original table files. Please let us know if you have questions. https://github.com/megagonlabs/sato/tree/master/table_data

shivangibithel commented 1 year ago

Hi @suhara

I am unable to download the dataset with the provided script. It fails with the following errors.

sh download_data.sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1351M 100 1351M 0 0 5462k 0 0:04:13 0:04:13 --:--:-- 6003k
Archive: tmp.zip
creating: tmp/
inflating: tmp/webtables2-p1_type78_par.pkl
inflating: tmp/opendata_type78_header_valid.pkl
inflating: tmp/manyeyes_type78_par.pkl
inflating: tmp/plotly_type78_header_valid.pkl
inflating: tmp/plotly_type78_par.pkl
inflating: tmp/opendata_type78_word.pkl
inflating: tmp/plotly_type78_sherlock_features.pkl
inflating: tmp/webtables0-p3_type274_header_valid.pkl
inflating: tmp/webtables2-p1_sherlock_features.pkl
inflating: tmp/opendata_type78_sherlock_features.pkl
inflating: tmp/opendata_type78_char.pkl
inflating: tmp/opendata_type78_rest.pkl
inflating: tmp/webtables0-p3_type78_header_valid.pkl
inflating: tmp/webtables1-p1_type78_par.pkl
inflating: tmp/webtables2-p1_type78_header_valid.pkl
inflating: tmp/webtables1-p1_type78_num-directstr_thr-0_tn-400.pkl
inflating: tmp/manyeyes_type78_char.pkl
inflating: tmp/manyeyes_type78_header_valid.pkl
inflating: tmp/webtables0-p1_type274_header_valid.pkl
inflating: tmp/manyeyes_type78_rest.pkl
inflating: tmp/manyeyes_type78_word.pkl
inflating: tmp/webtables1-p1_type78_sherlock_features.pkl
inflating: tmp/webtables0-p1_type78_header_valid.pkl
inflating: tmp/manyeyes_sherlock_features.pkl
inflating: tmp/plotly_type78_word.pkl
inflating: tmp/webtables1-p1_type78_word.pkl
inflating: tmp/webtables2-p1_type78_rest.pkl
inflating: tmp/webtables1-p1_type78_header_valid.pkl
inflating: tmp/webtables2-p1_type78_char.pkl
inflating: tmp/webtables2-p1_type78_num-directstr_thr-0_tn-400.pkl
inflating: tmp/webtables1-p1_sherlock_features.pkl
inflating: tmp/plotly_type78_char.pkl
inflating: tmp/webtables1-p1_type78_char.pkl
inflating: tmp/webtables2-p1_type78_word.pkl
inflating: tmp/plotly_type78_rest.pkl
inflating: tmp/webtables1-p1_type78_rest.pkl
inflating: tmp/opendata_type78_par.pkl
download_data.sh: line 8: sherlock/pretrained.zip: No such file or directory
download_data.sh: line 9: cd: sherlock: No such file or directory
unzip: cannot find or open pretrained.zip, pretrained.zip.zip or pretrained.zip.ZIP.
/Users/sato
download_data.sh: line 11: topic_model/LDA_cache.zip: No such file or directory
download_data.sh: line 12: cd: topic_model: No such file or directory
unzip: cannot find or open LDA_cache.zip, LDA_cache.zip.zip or LDA_cache.zip.ZIP.
/Users/shivangi
rm: sherlock/pretrained.zip: No such file or directory
rm: topic_model/LDA_cache.zip: No such file or directory
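
The "No such file or directory" messages on lines 8 to 12 of download_data.sh are consistent with the script expecting sherlock/ and topic_model/ directories relative to the directory it is run from, and not finding them. That reading is an assumption based only on the output above, not on the script's source. A minimal sketch of a pre-flight check follows; `missing_dirs` is a hypothetical helper, not part of the repository.

```shell
# Hypothetical pre-flight check; NOT part of download_data.sh.
# Prints each directory named in the error messages above that is
# missing from the given base directory (default: current directory).
missing_dirs() {
    base="${1:-.}"
    for d in sherlock topic_model; do
        # Print the name only when the directory does not exist.
        [ -d "$base/$d" ] || printf '%s\n' "$d"
    done
}
```

Running `missing_dirs` in the directory where `sh download_data.sh` was invoked should print nothing if both directories are present. If it prints a name, the script was probably started from a location that lacks that directory, and changing to the directory that contains them before rerunning may help.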