allenai / kb

KnowBert -- Knowledge Enhanced Contextual Word Representations
Apache License 2.0

Evaluate Perplexity: Issue when constructing DatasetReader from parameters (AWS api) #5

Closed: victorelkjaer closed this issue 4 years ago

victorelkjaer commented 4 years ago

Dear @matt-peters et al., First of all: Impressive work with KnowBert!

I'd like to replicate the results using the pretrained model and, as a first step, evaluate the perplexity of KnowBert-wiki.

I followed the Getting Started setup section, downloaded the held-out Wikipedia and BooksCorpus data, and
put everything in a submit script almost identical to:

MODEL_ARCHIVE=..location of model
HELDOUT_FILE=wikipedia_bookscorpus_knowbert_heldout.txt
python bin/evaluate_perplexity.py -m $MODEL_ARCHIVE -e $HELDOUT_FILE

Running the script fails when it tries to construct the DatasetReader from parameters, after the model has been loaded from the pretrained archive knowbert_wiki_model.tar.gz. See the traceback below:

File "bin/evaluate_perplexity.py", line 72, in <module>
    random_candidates=False)
  File "bin/evaluate_perplexity.py", line 38, in run_evaluation
    reader = DatasetReader.from_params(Params(reader_params))
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 289, in from_params
    return subclass.from_params(params=params, **extras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 300, in from_params
    kwargs = create_kwargs(cls, params, **extras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 159, in create_kwargs
    kwargs[name] = annotation.from_params(params=subparams, **subextras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 289, in from_params
    return subclass.from_params(params=params, **extras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 300, in from_params
    kwargs = create_kwargs(cls, params, **extras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 159, in create_kwargs
    kwargs[name] = annotation.from_params(params=subparams, **subextras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 289, in from_params
    return subclass.from_params(params=params, **extras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 300, in from_params
    kwargs = create_kwargs(cls, params, **extras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 194, in create_kwargs
    value_dict[key] = value_cls.from_params(params=value_params, **extras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 289, in from_params
    return subclass.from_params(params=params, **extras)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/from_params.py", line 302, in from_params
    return cls(**kwargs)  # type: ignore
  File "/zhome/9e/7/97809/thesis/final-project-02456/kb-master/kb/wiki_linking_util.py", line 152, in __init__
    entity_world_path = cached_path(entity_world_path or self.defaults["entity_world_path"])
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/file_utils.py", line 98, in cached_path
    return get_from_cache(url_or_filename, cache_dir)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/file_utils.py", line 194, in get_from_cache
    etag = s3_etag(url)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/file_utils.py", line 142, in wrapper
    return func(url, *args, **kwargs)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/allennlp/common/file_utils.py", line 158, in s3_etag
    return s3_object.e_tag
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/boto3/resources/factory.py", line 339, in property_loader
    self.load()
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/boto3/resources/factory.py", line 505, in do_action
    response = action(self, *args, **kwargs)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/boto3/resources/action.py", line 83, in __call__
    response = getattr(parent.meta.client, operation_name)(**params)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/zhome/9e/7/97809/miniconda3/envs/knowbert/lib/python3.6/site-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Before I hit this error, I got botocore.exceptions.NoCredentialsError: Unable to locate credentials, similar to the issue described here: https://github.com/spulec/moto/issues/1941. After I created an AWS configuration, the error changed to

botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Have you seen this before? Is the problem that I do not have access to the requested data in the S3 bucket?

I thank you for your time!

Best regards, Victor

matt-peters commented 4 years ago

The object should be public, but you may need an S3 account to access it. The files can also be downloaded via https. Does the mp/s3 branch fix the issue?
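
For anyone who wants to try the https route by hand, here is a minimal sketch of rewriting an s3:// URL to its https equivalent, so that a public object can be fetched without AWS credentials. It assumes the standard virtual-hosted S3 URL layout (https://BUCKET.s3.amazonaws.com/KEY). The helper name s3_to_https and the example bucket and key are hypothetical; none of this is taken from the kb code or from the mp/s3 branch.

```python
from urllib.parse import urlparse


def s3_to_https(url: str) -> str:
    """Rewrite s3://bucket/key as a virtual-hosted https URL.

    Hypothetical helper for illustration only. URLs that do not use
    the s3 scheme are returned unchanged.
    """
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        return url
    bucket = parsed.netloc           # bucket name sits in the host position
    key = parsed.path.lstrip("/")    # object key, without the leading slash
    return f"https://{bucket}.s3.amazonaws.com/{key}"


# Example with a made-up bucket and key (not a real KnowBert path):
https_url = s3_to_https("s3://example-bucket/some/dir/file.txt")
print(https_url)
```

The resulting https URL can then be handed to whatever loads the file (for example AllenNLP's cached_path, which also accepts http and https URLs), so no boto3 HeadObject call is made and no credentials are needed for public objects.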

victorelkjaer commented 4 years ago

The mp/s3 branch does fix the issue, yes. Thanks @matt-peters