HazyResearch / bootleg

Self-Supervision for Named Entity Disambiguation at the Tail
http://hazyresearch.stanford.edu/bootleg
Apache License 2.0

bug of example #88

Closed lshowway closed 3 years ago

lshowway commented 3 years ago

`from bootleg.end2end.bootleg_annotator import BootlegAnnotator

ann = BootlegAnnotator()
t = ann.label_mentions("Bob Dylan release Desire")["titles"]`

The bug is: usage: process_STS_dev.py [-h] [--emmental.seed EMMENTAL.SEED] [--emmental.verbose EMMENTAL.VERBOSE] [--emmental.log_path EMMENTAL.LOG_PATH] [--emmental.use_exact_log_path EMMENTAL.USE_EXACT_LOG_PATH] [--emmental.min_data_len EMMENTAL.MIN_DATA_LEN] [--emmental.max_data_len EMMENTAL.MAX_DATA_LEN] [--emmental.model_path EMMENTAL.MODEL_PATH] [--emmental.device EMMENTAL.DEVICE] [--emmental.dataparallel EMMENTAL.DATAPARALLEL] [--emmental.distributed_backend {nccl,gloo}] [--emmental.fp16 EMMENTAL.FP16] [--emmental.fp16_opt_level EMMENTAL.FP16_OPT_LEVEL] [--emmental.local_rank EMMENTAL.LOCAL_RANK] [--emmental.n_epochs EMMENTAL.N_EPOCHS] [--emmental.train_split EMMENTAL.TRAIN_SPLIT [EMMENTAL.TRAIN_SPLIT ...]] [--emmental.valid_split EMMENTAL.VALID_SPLIT [EMMENTAL.VALID_SPLIT ...]] [--emmental.test_split EMMENTAL.TEST_SPLIT [EMMENTAL.TEST_SPLIT ...]] [--emmental.ignore_index EMMENTAL.IGNORE_INDEX] [--emmental.online_eval EMMENTAL.ONLINE_EVAL] [--emmental.optimizer {asgd,adadelta,adagrad,adam,adamw,adamax,lbfgs,rms_prop,r_prop,sgd,sparse_adam,bert_adam,None}] [--emmental.lr EMMENTAL.LR] [--emmental.l2 EMMENTAL.L2] [--emmental.grad_clip EMMENTAL.GRAD_CLIP] [--emmental.gradient_accumulation_steps EMMENTAL.GRADIENT_ACCUMULATION_STEPS] [--emmental.asgd_lambd EMMENTAL.ASGD_LAMBD] [--emmental.asgd_alpha EMMENTAL.ASGD_ALPHA] [--emmental.asgd_t0 EMMENTAL.ASGD_T0] [--emmental.adadelta_rho EMMENTAL.ADADELTA_RHO] [--emmental.adadelta_eps EMMENTAL.ADADELTA_EPS] [--emmental.adagrad_lr_decay EMMENTAL.ADAGRAD_LR_DECAY] [--emmental.adagrad_initial_accumulator_value EMMENTAL.ADAGRAD_INITIAL_ACCUMULATOR_VALUE] [--emmental.adagrad_eps EMMENTAL.ADAGRAD_EPS] [--emmental.adam_betas EMMENTAL.ADAM_BETAS [EMMENTAL.ADAM_BETAS ...]] [--emmental.adam_eps EMMENTAL.ADAM_EPS] [--emmental.adam_amsgrad EMMENTAL.ADAM_AMSGRAD] [--emmental.adamw_betas EMMENTAL.ADAMW_BETAS [EMMENTAL.ADAMW_BETAS ...]] [--emmental.adamw_eps EMMENTAL.ADAMW_EPS] [--emmental.adamw_amsgrad EMMENTAL.ADAMW_AMSGRAD] 
[--emmental.adamax_betas EMMENTAL.ADAMAX_BETAS [EMMENTAL.ADAMAX_BETAS ...]] [--emmental.adamax_eps EMMENTAL.ADAMAX_EPS] [--emmental.lbfgs_max_iter EMMENTAL.LBFGS_MAX_ITER] [--emmental.lbfgs_max_eval EMMENTAL.LBFGS_MAX_EVAL] [--emmental.lbfgs_tolerance_grad EMMENTAL.LBFGS_TOLERANCE_GRAD] [--emmental.lbfgs_tolerance_change EMMENTAL.LBFGS_TOLERANCE_CHANGE] [--emmental.lbfgs_history_size EMMENTAL.LBFGS_HISTORY_SIZE] [--emmental.lbfgs_line_search_fn EMMENTAL.LBFGS_LINE_SEARCH_FN] [--emmental.rms_prop_alpha EMMENTAL.RMS_PROP_ALPHA] [--emmental.rms_prop_eps EMMENTAL.RMS_PROP_EPS] [--emmental.rms_prop_momentum EMMENTAL.RMS_PROP_MOMENTUM] [--emmental.rms_prop_centered EMMENTAL.RMS_PROP_CENTERED] [--emmental.r_prop_etas EMMENTAL.R_PROP_ETAS [EMMENTAL.R_PROP_ETAS ...]] [--emmental.r_prop_step_sizes EMMENTAL.R_PROP_STEP_SIZES [EMMENTAL.R_PROP_STEP_SIZES ...]] [--emmental.sgd_momentum EMMENTAL.SGD_MOMENTUM] [--emmental.sgd_dampening EMMENTAL.SGD_DAMPENING] [--emmental.sgd_nesterov EMMENTAL.SGD_NESTEROV] [--emmental.sparse_adam_betas EMMENTAL.SPARSE_ADAM_BETAS [EMMENTAL.SPARSE_ADAM_BETAS ...]] [--emmental.sparse_adam_eps EMMENTAL.SPARSE_ADAM_EPS] [--emmental.bert_adam_betas EMMENTAL.BERT_ADAM_BETAS [EMMENTAL.BERT_ADAM_BETAS ...]] [--emmental.bert_adam_eps EMMENTAL.BERT_ADAM_EPS] [--emmental.lr_scheduler {linear,exponential,plateau,step,multi_step,cyclic,one_cycle,cosine_annealing}] [--emmental.lr_scheduler_step_unit {batch,epoch}] [--emmental.lr_scheduler_step_freq EMMENTAL.LR_SCHEDULER_STEP_FREQ] [--emmental.warmup_steps EMMENTAL.WARMUP_STEPS] [--emmental.warmup_unit {batch,epoch}] [--emmental.warmup_percentage EMMENTAL.WARMUP_PERCENTAGE] [--emmental.min_lr EMMENTAL.MIN_LR] [--emmental.reset_state EMMENTAL.RESET_STATE] [--emmental.exponential_lr_scheduler_gamma EMMENTAL.EXPONENTIAL_LR_SCHEDULER_GAMMA] [--emmental.plateau_lr_scheduler_metric EMMENTAL.PLATEAU_LR_SCHEDULER_METRIC] [--emmental.plateau_lr_scheduler_mode {min,max}] [--emmental.plateau_lr_scheduler_factor 
EMMENTAL.PLATEAU_LR_SCHEDULER_FACTOR] [--emmental.plateau_lr_scheduler_patience EMMENTAL.PLATEAU_LR_SCHEDULER_PATIENCE] [--emmental.plateau_lr_scheduler_threshold EMMENTAL.PLATEAU_LR_SCHEDULER_THRESHOLD] [--emmental.plateau_lr_scheduler_threshold_mode {rel,abs}] [--emmental.plateau_lr_scheduler_cooldown EMMENTAL.PLATEAU_LR_SCHEDULER_COOLDOWN] [--emmental.plateau_lr_scheduler_eps EMMENTAL.PLATEAU_LR_SCHEDULER_EPS] [--emmental.step_lr_scheduler_step_size EMMENTAL.STEP_LR_SCHEDULER_STEP_SIZE] [--emmental.step_lr_scheduler_gamma EMMENTAL.STEP_LR_SCHEDULER_GAMMA] [--emmental.step_lr_scheduler_last_epoch EMMENTAL.STEP_LR_SCHEDULER_LAST_EPOCH] [--emmental.multi_step_lr_scheduler_milestones EMMENTAL.MULTI_STEP_LR_SCHEDULER_MILESTONES [EMMENTAL.MULTI_STEP_LR_SCHEDULER_MILESTONES ...]] [--emmental.multi_step_lr_scheduler_gamma EMMENTAL.MULTI_STEP_LR_SCHEDULER_GAMMA] [--emmental.multi_step_lr_scheduler_last_epoch EMMENTAL.MULTI_STEP_LR_SCHEDULER_LAST_EPOCH] [--emmental.cyclic_lr_scheduler_base_lr EMMENTAL.CYCLIC_LR_SCHEDULER_BASE_LR [EMMENTAL.CYCLIC_LR_SCHEDULER_BASE_LR ...]] [--emmental.cyclic_lr_scheduler_max_lr EMMENTAL.CYCLIC_LR_SCHEDULER_MAX_LR [EMMENTAL.CYCLIC_LR_SCHEDULER_MAX_LR ...]] [--emmental.cyclic_lr_scheduler_step_size_up EMMENTAL.CYCLIC_LR_SCHEDULER_STEP_SIZE_UP] [--emmental.cyclic_lr_scheduler_step_size_down EMMENTAL.CYCLIC_LR_SCHEDULER_STEP_SIZE_DOWN] [--emmental.cyclic_lr_scheduler_mode EMMENTAL.CYCLIC_LR_SCHEDULER_MODE] [--emmental.cyclic_lr_scheduler_gamma EMMENTAL.CYCLIC_LR_SCHEDULER_GAMMA] [--emmental.cyclic_lr_scheduler_scale_mode {cycle,iterations}] [--emmental.cyclic_lr_scheduler_cycle_momentum EMMENTAL.CYCLIC_LR_SCHEDULER_CYCLE_MOMENTUM] [--emmental.cyclic_lr_scheduler_base_momentum EMMENTAL.CYCLIC_LR_SCHEDULER_BASE_MOMENTUM [EMMENTAL.CYCLIC_LR_SCHEDULER_BASE_MOMENTUM ...]] [--emmental.cyclic_lr_scheduler_max_momentum EMMENTAL.CYCLIC_LR_SCHEDULER_MAX_MOMENTUM [EMMENTAL.CYCLIC_LR_SCHEDULER_MAX_MOMENTUM ...]] [--emmental.cyclic_lr_scheduler_last_epoch 
EMMENTAL.CYCLIC_LR_SCHEDULER_LAST_EPOCH] [--emmental.one_cycle_lr_scheduler_max_lr EMMENTAL.ONE_CYCLE_LR_SCHEDULER_MAX_LR [EMMENTAL.ONE_CYCLE_LR_SCHEDULER_MAX_LR ...]] [--emmental.one_cycle_lr_scheduler_pct_start EMMENTAL.ONE_CYCLE_LR_SCHEDULER_PCT_START] [--emmental.one_cycle_lr_scheduler_anneal_strategy {cos,linear}] [--emmental.one_cycle_lr_scheduler_cycle_momentum EMMENTAL.ONE_CYCLE_LR_SCHEDULER_CYCLE_MOMENTUM] [--emmental.one_cycle_lr_scheduler_base_momentum EMMENTAL.ONE_CYCLE_LR_SCHEDULER_BASE_MOMENTUM [EMMENTAL.ONE_CYCLE_LR_SCHEDULER_BASE_MOMENTUM ...]] [--emmental.one_cycle_lr_scheduler_max_momentum EMMENTAL.ONE_CYCLE_LR_SCHEDULER_MAX_MOMENTUM [EMMENTAL.ONE_CYCLE_LR_SCHEDULER_MAX_MOMENTUM ...]] [--emmental.one_cycle_lr_scheduler_div_factor EMMENTAL.ONE_CYCLE_LR_SCHEDULER_DIV_FACTOR] [--emmental.one_cycle_lr_scheduler_final_div_factor EMMENTAL.ONE_CYCLE_LR_SCHEDULER_FINAL_DIV_FACTOR] [--emmental.one_cycle_lr_scheduler_last_epoch EMMENTAL.ONE_CYCLE_LR_SCHEDULER_LAST_EPOCH] [--emmental.cosine_annealing_lr_scheduler_last_epoch EMMENTAL.COSINE_ANNEALING_LR_SCHEDULER_LAST_EPOCH] [--emmental.task_scheduler EMMENTAL.TASK_SCHEDULER] [--emmental.sequential_scheduler_fillup EMMENTAL.SEQUENTIAL_SCHEDULER_FILLUP] [--emmental.round_robin_scheduler_fillup EMMENTAL.ROUND_ROBIN_SCHEDULER_FILLUP] [--emmental.mixed_scheduler_fillup EMMENTAL.MIXED_SCHEDULER_FILLUP] [--emmental.counter_unit {epoch,batch}] [--emmental.evaluation_freq EMMENTAL.EVALUATION_FREQ] [--emmental.writer {json,tensorboard}] [--emmental.checkpointing EMMENTAL.CHECKPOINTING] [--emmental.checkpoint_path EMMENTAL.CHECKPOINT_PATH] [--emmental.checkpoint_freq EMMENTAL.CHECKPOINT_FREQ] [--emmental.checkpoint_metric EMMENTAL.CHECKPOINT_METRIC] [--emmental.checkpoint_task_metrics EMMENTAL.CHECKPOINT_TASK_METRICS] [--emmental.checkpoint_runway EMMENTAL.CHECKPOINT_RUNWAY] [--emmental.checkpoint_all EMMENTAL.CHECKPOINT_ALL] [--emmental.clear_intermediate_checkpoints EMMENTAL.CLEAR_INTERMEDIATE_CHECKPOINTS] 
[--emmental.clear_all_checkpoints EMMENTAL.CLEAR_ALL_CHECKPOINTS] [--run_config.spawn_method RUN_CONFIG.SPAWN_METHOD] [--run_config.eval_batch_size RUN_CONFIG.EVAL_BATCH_SIZE] [--run_config.dataloader_threads RUN_CONFIG.DATALOADER_THREADS] [--run_config.log_level RUN_CONFIG.LOG_LEVEL] [--run_config.dataset_threads RUN_CONFIG.DATASET_THREADS] [--run_config.result_label_file RUN_CONFIG.RESULT_LABEL_FILE] [--run_config.result_emb_file RUN_CONFIG.RESULT_EMB_FILE] [--train_config.dropout TRAIN_CONFIG.DROPOUT] [--train_config.batch_size TRAIN_CONFIG.BATCH_SIZE] [--model_config.attn_class MODEL_CONFIG.ATTN_CLASS] [--model_config.hidden_size MODEL_CONFIG.HIDDEN_SIZE] [--model_config.num_heads MODEL_CONFIG.NUM_HEADS] [--model_config.ff_inner_size MODEL_CONFIG.FF_INNER_SIZE] [--model_config.num_model_stages MODEL_CONFIG.NUM_MODEL_STAGES] [--model_config.num_fc_layers MODEL_CONFIG.NUM_FC_LAYERS] [--model_config.custom_args MODEL_CONFIG.CUSTOM_ARGS] [--data_config.eval_slices DATA_CONFIG.EVAL_SLICES] [--data_config.train_in_candidates DATA_CONFIG.TRAIN_IN_CANDIDATES] [--data_config.data_dir DATA_CONFIG.DATA_DIR] [--data_config.data_prep_dir DATA_CONFIG.DATA_PREP_DIR] [--data_config.entity_dir DATA_CONFIG.ENTITY_DIR] [--data_config.entity_prep_dir DATA_CONFIG.ENTITY_PREP_DIR] [--data_config.entity_map_dir DATA_CONFIG.ENTITY_MAP_DIR] [--data_config.alias_cand_map DATA_CONFIG.ALIAS_CAND_MAP] [--data_config.emb_dir DATA_CONFIG.EMB_DIR] [--data_config.max_seq_len DATA_CONFIG.MAX_SEQ_LEN] [--data_config.max_aliases DATA_CONFIG.MAX_ALIASES] [--data_config.overwrite_preprocessed_data DATA_CONFIG.OVERWRITE_PREPROCESSED_DATA] [--data_config.type_prediction.use_type_pred DATA_CONFIG.TYPE_PREDICTION.USE_TYPE_PRED] [--data_config.type_prediction.file DATA_CONFIG.TYPE_PREDICTION.FILE] [--data_config.type_prediction.num_types DATA_CONFIG.TYPE_PREDICTION.NUM_TYPES] [--data_config.type_prediction.dim DATA_CONFIG.TYPE_PREDICTION.DIM] [--data_config.train_dataset.file 
DATA_CONFIG.TRAIN_DATASET.FILE] [--data_config.train_dataset.use_weak_label DATA_CONFIG.TRAIN_DATASET.USE_WEAK_LABEL] [--data_config.dev_dataset.file DATA_CONFIG.DEV_DATASET.FILE] [--data_config.dev_dataset.use_weak_label DATA_CONFIG.DEV_DATASET.USE_WEAK_LABEL] [--data_config.test_dataset.file DATA_CONFIG.TEST_DATASET.FILE] [--data_config.test_dataset.use_weak_label DATA_CONFIG.TEST_DATASET.USE_WEAK_LABEL] [--data_config.word_embedding.bert_model DATA_CONFIG.WORD_EMBEDDING.BERT_MODEL] [--data_config.word_embedding.use_sent_proj DATA_CONFIG.WORD_EMBEDDING.USE_SENT_PROJ] [--data_config.word_embedding.layers DATA_CONFIG.WORD_EMBEDDING.LAYERS] [--data_config.word_embedding.freeze DATA_CONFIG.WORD_EMBEDDING.FREEZE] [--data_config.word_embedding.cache_dir DATA_CONFIG.WORD_EMBEDDING.CACHE_DIR] [--data_config.ent_embeddings DATA_CONFIG.ENT_EMBEDDINGS]

process_STS_dev.py: error: unrecognized arguments: --data_config.context_mask_perc 0.0 --data_config.entity_kg_data {"kg_labels":"kg_mappings\/qid2relations.json","kg_vocab":"kg_mappings\/relation_vocab.json","use_entity_kg":true} --data_config.entity_type_data {"type_labels":"type_mappings\/wiki\/qid2typeids.json","type_vocab":"type_mappings\/wiki\/type_vocab.json","use_entity_types":true} --data_config.max_ent_len 128 --data_config.max_seq_window_len 64 --data_config.use_entity_desc True --data_config.word_embedding.context_layers 6 --data_config.word_embedding.entity_layers 6 --emmental.n_steps 428648 --emmental.write_loss_per_step True --model_config.normalize True --model_config.temperature 0.01

Process finished with exit code 2

lorr1 commented 3 years ago

What is process_STS_dev? Also, how did you install bootleg? Did you use pip? Please try a direct install from GitHub, as the most recent master branch is the one that works with the downloaded models. We are in a prerelease with the models and our pip release is out of date.
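One plausible reading of the error above, offered as a sketch and not as a description of Bootleg's internals: the downloaded model's config carries options that the older install does not define, so argparse rejects them and exits with status 2. The parser and the flag lists below are hypothetical stand-ins, not Bootleg code.

```python
import argparse


def build_parser(known_flags):
    """Build a parser that accepts only the given --flag names."""
    parser = argparse.ArgumentParser(prog="example.py")
    for flag in known_flags:
        parser.add_argument(f"--{flag}")
    return parser


# Hypothetical stand-in for the options an older install knows about.
old_parser = build_parser(["data_config.max_seq_len", "model_config.hidden_size"])

# Hypothetical stand-in for arguments derived from a newer model config.
argv = [
    "--data_config.max_seq_len", "128",
    "--data_config.max_ent_len", "128",
    "--model_config.temperature", "0.01",
]

# parse_known_args returns the unrecognized leftovers instead of exiting.
_, unknown = old_parser.parse_known_args(argv)
print("unrecognized:", unknown)

# parse_args treats the same leftovers as an error: it prints the usage
# text to stderr and raises SystemExit with status 2.
try:
    old_parser.parse_args(argv)
except SystemExit as exc:
    exit_code = exc.code
    print("exit code:", exit_code)
```

The status 2 matches the "Process finished with exit code 2" line in the report, which is consistent with a parser that is older than the config it is asked to read.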

lshowway commented 3 years ago

@lorr1 Yes, I installed it from pip, and I am now trying to install it from GitHub. But shouldn't the last command be `python setup.py install`? The install instructions currently read: `git clone git@github.com:HazyResearch/bootleg bootleg`, then `cd bootleg`, then `python3 setup.py`.

lorr1 commented 3 years ago

Good catch. Yes, it should be `python3 setup.py install`. I will correct that in the README.md.
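After reinstalling from source, it can help to confirm which copy Python actually picks up. The helper below is a minimal sketch using only the standard library; `describe_install` is a hypothetical name, and using "bootleg" as the distribution name is an assumption (the module name matches the import used in this thread).

```python
import importlib.util
from importlib import metadata


def describe_install(dist_name, module_name):
    """Return (version, location) for a package; None for anything missing."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    # find_spec locates a top-level module without importing it.
    spec = importlib.util.find_spec(module_name)
    location = spec.origin if spec is not None else None
    return version, location


# Assumption: the distribution is published under the name "bootleg".
print(describe_install("bootleg", "bootleg"))
```

If the reported location still points at the old pip copy in site-packages, uninstalling that copy before running `python3 setup.py install` is the usual remedy.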