MS20190155 / Measuring-Corporate-Culture-Using-Machine-Learning

Code Repository for MS20190155
135 stars, 97 forks

Issues while running parse.py #1

Closed rajexplo closed 4 years ago

rajexplo commented 4 years ago

Hi, I was trying to run parse.py and it threw a traceback ending in: local variable 'sentences_processed' referenced before assignment. It processes part of the input and then throws this error. I tried both multiprocessing and a single core. Any help will be appreciated.

Thanks!

maifeng commented 4 years ago

If you set chunk_size = 10 in line 41 of parse.py, does it run past the first 10 lines of the example inputs?

If not, you may want to check whether the path to CoreNLP is set correctly, whether Java is installed, and whether you have sufficient permission to start the server (e.g. try running parse.py with sudo). Are you able to get the CoreNLP client tutorial code working?
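The first two checks above can be scripted. This is only a rough sketch, not part of the repository: the function name check_corenlp_setup is made up for illustration, and it assumes the CoreNLP location is given through the CORENLP_HOME environment variable and that the directory should contain the CoreNLP .jar files. It does not test permissions or start the server.

```python
import os
import shutil


def check_corenlp_setup(corenlp_home=None):
    """Return a list of human-readable problems; the list is empty if none are found.

    Hypothetical helper: assumes CoreNLP is located via CORENLP_HOME
    unless a path is passed in explicitly.
    """
    problems = []
    home = corenlp_home or os.environ.get("CORENLP_HOME")
    if not home:
        problems.append("CORENLP_HOME is not set")
    elif not os.path.isdir(home):
        problems.append(f"CoreNLP directory does not exist: {home}")
    elif not any(name.endswith(".jar") for name in os.listdir(home)):
        problems.append(f"no .jar files found in: {home}")

    # The CoreNLP server is a Java process, so java must be on PATH.
    if shutil.which("java") is None:
        problems.append("java executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_corenlp_setup():
        print("Problem:", problem)
```

If this prints nothing, the path and Java are probably fine and the document length setting is the next thing to look at.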

Another thing to check is the maximum document length in characters (if you are not using the example inputs). maxCharLength is set to 100000 by default in CoreNLP. You can change the setting in the main function:

    with CoreNLPClient(
        properties={
            "ner.applyFineGrained": "false",
            "annotators": "tokenize, ssplit, pos, lemma, ner, depparse",
        },
        memory=global_options.RAM_CORENLP,
        threads=global_options.N_CORES,
        timeout=12000000,
        max_char_length=1000000,
    ) as client:
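
To see whether any of your own documents are over the limit before raising it, something like the following may help. It is a sketch under assumptions: find_long_documents is a hypothetical helper, it assumes one document per line in the input, and it treats a document as too long when its character count is strictly greater than the limit.

```python
def find_long_documents(lines, max_char_length=100000):
    """Return (line_number, length) for each document longer than the limit.

    Hypothetical helper: assumes each element of `lines` is one document.
    """
    too_long = []
    for line_number, line in enumerate(lines, start=1):
        length = len(line.rstrip("\n"))  # ignore the trailing newline
        if length > max_char_length:
            too_long.append((line_number, length))
    return too_long


if __name__ == "__main__":
    sample = ["short document\n", "x" * 25 + "\n"]
    print(find_long_documents(sample, max_char_length=20))  # prints [(2, 25)]
```

For a real input file you could pass the open file object directly, e.g. find_long_documents(open(path, encoding="utf-8")), where path points to your own input file.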

rajexplo commented 4 years ago

Thanks! It worked! The issue was in the Stanford CoreNLP settings! Thanks!