abugasavio / piianalyzer

An iHub Summer 2015 project

First run #9

Closed by krywykj 6 years ago

krywykj commented 7 years ago

Hello,

First of all, thank you for the work you did. I'm using it to look at other ways of detecting PII!

I'm trying to run the code, but I get the following error:

from piianalyzer.analyzer import PiiAnalyzer
filepath = 'datasetPIIinternationaux.csv'
piianalyzer = PiiAnalyzer(filepath)
analysis = piianalyzer.analysis()

---------------------------------------------------------------------------
LookupError                               Traceback (most recent call last)
<ipython-input-1-fa7f427d3466> in <module>()
      1 from piianalyzer.analyzer import PiiAnalyzer
      2 filepath = 'datasetPIIinternationaux.csv'
----> 3 piianalyzer = PiiAnalyzer(filepath)
      4 analysis = piianalyzer.analysis()

c:\users\mehdi\miniconda3\lib\site-packages\piianalyzer\analyzer.py in __init__(self, filepath)
      8         self.filepath = filepath
      9         self.parser = CommonRegex()
---> 10         self.standford_ner = StanfordNERTagger('classifiers/english.conll.4class.distsim.crf.ser.gz')
     11 
     12     def analysis(self):

c:\users\mehdi\miniconda3\lib\site-packages\nltk\tag\stanford.py in __init__(self, *args, **kwargs)
    173 
    174     def __init__(self, *args, **kwargs):
--> 175         super(StanfordNERTagger, self).__init__(*args, **kwargs)
    176 
    177     @property

c:\users\mehdi\miniconda3\lib\site-packages\nltk\tag\stanford.py in __init__(self, model_filename, path_to_jar, encoding, verbose, java_options)
     56                 self._JAR, path_to_jar,
     57                 searchpath=(), url=_stanford_url,
---> 58                 verbose=verbose)
     59 
     60         self._stanford_model = find_file(model_filename,

c:\users\mehdi\miniconda3\lib\site-packages\nltk\__init__.py in find_jar(name_pattern, path_to_jar, env_vars, searchpath, url, verbose, is_regex)
    719         searchpath=(), url=None, verbose=False, is_regex=False):
    720     return next(find_jar_iter(name_pattern, path_to_jar, env_vars,
--> 721                          searchpath, url, verbose, is_regex))
    722 
    723 

c:\users\mehdi\miniconda3\lib\site-packages\nltk\__init__.py in find_jar_iter(name_pattern, path_to_jar, env_vars, searchpath, url, verbose, is_regex)
    714                     (name_pattern, url))
    715         div = '='*75
--> 716         raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
    717 
    718 def find_jar(name_pattern, path_to_jar=None, env_vars=(),

LookupError: 

===========================================================================
  NLTK was unable to find stanford-ner.jar! Set the CLASSPATH
  environment variable.

  For more information, on stanford-ner.jar, see:
    <https://nlp.stanford.edu/software>
===========================================================================

Any idea how I can get the code to run?

Best wishes, Julien

abugasavio commented 7 years ago

Hello Julien, this project requires the Stanford Named Entity Recognizer. Please download it here
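
For anyone hitting the same LookupError, here is a minimal sketch of one way to make the downloaded jar visible to NLTK from Python before PiiAnalyzer is created. It assumes the Stanford NER archive has been unpacked into a single folder that contains stanford-ner.jar and a classifiers/ subfolder, and that Java is installed. The helper name configure_stanford_ner and the folder path are hypothetical and not part of piianalyzer. The sketch relies on NLTK searching the CLASSPATH environment variable for the jar (as the error message says) and, as far as I know, the STANFORD_MODELS variable for the model file.

```python
import os
from pathlib import Path


def configure_stanford_ner(ner_dir):
    """Expose an unpacked Stanford NER folder to NLTK via environment variables.

    ner_dir is a hypothetical location: the folder that holds
    stanford-ner.jar and the classifiers/ subfolder.
    """
    ner_dir = Path(ner_dir).expanduser().resolve()
    jar = ner_dir / "stanford-ner.jar"
    if not jar.is_file():
        raise FileNotFoundError(f"stanford-ner.jar not found in {ner_dir}")

    # NLTK looks for the jar on CLASSPATH; keep any entries already there.
    existing = os.environ.get("CLASSPATH")
    entries = [str(jar)] + ([existing] if existing else [])
    os.environ["CLASSPATH"] = os.pathsep.join(entries)

    # Assumption: NLTK resolves the relative model name
    # 'classifiers/english.conll.4class.distsim.crf.ser.gz' against
    # the directories listed in STANFORD_MODELS.
    os.environ["STANFORD_MODELS"] = str(ner_dir)
    return jar


# Usage (the path is a placeholder; adjust it to your own download):
# configure_stanford_ner(r"C:\path\to\stanford-ner")
# from piianalyzer.analyzer import PiiAnalyzer
# piianalyzer = PiiAnalyzer("datasetPIIinternationaux.csv")
# analysis = piianalyzer.analysis()
```

Setting the same two variables in the shell before starting Python should have the same effect. The helper only saves you from editing system settings.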