jflanigan / jamr

JAMR Parser and Generator
BSD 2-Clause "Simplified" License

No output for PARSE.sh #13

Closed: nahgnaw closed this issue 8 years ago

nahgnaw commented 8 years ago

Hi, I tried running scripts/PARSE.sh < input_file > output_file 2> output_file.err with a single test sentence in the input file. The output file came out empty, and the log file contained the following:

 ### Tokenizing input ###
Unicode character 0xfdd3 is illegal at /home/nahgnaw/jamr/tools/cdec/corpus/support/quote-norm.pl line 56.
 ### Running NER system ###
~/jamr/tools/IllinoisNerExtended ~/jamr
Adding feature: Forms
Adding feature: Capitalization
Adding feature: WordTypeInformation
Adding feature: Affixes
Adding feature: PreviousTag1
Adding feature: PreviousTag2
Adding feature: PreviousTagPatternLevel1
Adding feature: PreviousTagPatternLevel2
Adding feature: PrevTagsForContext
Adding feature: PredictionsLevel1
Adding feature: GazetteersFeatures
Adding feature: BrownClusterPaths
Loading gazetteers....
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    loading gazetteer:....ner-ext/KnownLists/WikiPeople.lst
    loading gazetteer:....ner-ext/KnownLists/ordinalNumber.txt
    loading gazetteer:....ner-ext/KnownLists/WikiSongs.lst
    loading gazetteer:....ner-ext/KnownLists/WikiManMadeObjectNames.lst
    loading gazetteer:....ner-ext/KnownLists/WikiArtWorkRedirects.lst
    loading gazetteer:....ner-ext/KnownLists/known_name.lst
    loading gazetteer:....ner-ext/KnownLists/Occupations.txt
    loading gazetteer:....ner-ext/KnownLists/WikiLocations.lst
    loading gazetteer:....ner-ext/KnownLists/known_state.lst
    loading gazetteer:....ner-ext/KnownLists/WikiCompetitionsBattlesEventsRedirects.lst
    loading gazetteer:....ner-ext/KnownLists/WikiOrganizationsRedirects.lst
    loading gazetteer:....ner-ext/KnownLists/known_nationalities.lst
    loading gazetteer:....ner-ext/KnownLists/WikiManMadeObjectNamesRedirects.lst
    loading gazetteer:....ner-ext/KnownLists/WikiSongsRedirects.lst
    loading gazetteer:....ner-ext/KnownLists/cardinalNumber.txt
    loading gazetteer:....ner-ext/KnownLists/currencyFinal.txt
    loading gazetteer:....ner-ext/KnownLists/known_names.big.lst
    loading gazetteer:....ner-ext/KnownLists/known_jobs.lst
    loading gazetteer:....ner-ext/KnownLists/known_title.lst
    loading gazetteer:....ner-ext/KnownLists/WikiFilmsRedirects.lst
    loading gazetteer:....ner-ext/KnownLists/temporal_words.txt
    loading gazetteer:....ner-ext/KnownLists/measurments.txt
    loading gazetteer:....ner-ext/KnownLists/known_place.lst
    loading gazetteer:....ner-ext/KnownLists/known_country.lst
    loading gazetteer:....ner-ext/KnownLists/known_corporations.lst
    loading gazetteer:....ner-ext/KnownLists/WikiOrganizations.lst
    loading gazetteer:....ner-ext/KnownLists/VincentNgPeopleTitles.txt
    loading gazetteer:....ner-ext/KnownLists/WikiFilms.lst
    loading gazetteer:....ner-ext/KnownLists/WikiLocationsRedirects.lst
    loading gazetteer:....ner-ext/KnownLists/WikiArtWork.lst
    loading gazetteer:....ner-ext/KnownLists/WikiPeopleRedirects.lst
    loading gazetteer:....ner-ext/KnownLists/WikiCompetitionsBattlesEvents.lst
    loading gazetteer:....ner-ext/KnownLists/KnownNationalities.txt
found 33 gazetteers
1288301 words added
95262 words added
85963 words added

Working parameters are:
    inferenceMethod=GREEDY
    beamSize=5
    thresholdPrediction=false
    predictionConfidenceThreshold=-1.0
    labelTypes
        PER     ORG     LOC     MISC
    logging=false
    debuggingLogPath=null
    forceNewSentenceOnLineBreaks=true
    keepOriginalFileTokenizationAndSentenceSplitting=false
    taggingScheme=BILOU
    tokenizationScheme=DualTokenizationScheme
    pathToModelFile=data/Models/CoNLL/finalSystemBILOU.model
Brown clusters resource:
    -Path: brown-clusters/brown-english-wikitext.case-intact.txt-c1000-freq10-v3.txt
    -WordThres=5
    -IsLowercased=false
Brown clusters resource:
    -Path: brown-clusters/brownBllipClusters
    -WordThres=5
    -IsLowercased=false
Brown clusters resource:
    -Path: brown-clusters/brown-rcv1.clean.tokenized-CoNLL03.txt-c1000-freq1.txt
    -WordThres=5
    -IsLowercased=false

Tagging file: /tmp/jamr-25472.snt.tmp
Reading model file : data/Models/CoNLL/finalSystemBILOU.model.level1
Reading model file : data/Models/CoNLL/finalSystemBILOU.model.level2
Extracting features for level 2 inference
Done - Extracting features for level 2 inference
~/jamr

Was it because of the Unicode error, or is there something else I'm missing?
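
One way to narrow this down is to see which pipeline stages the log actually reports. The log above prints banners such as " ### Tokenizing input ###", so a minimal Python sketch can list them in order. The function name `stages_reached` is hypothetical and not part of JAMR, and the banner format is assumed from the log shown here:

```python
import re

# Stage banners in the log look like " ### Tokenizing input ###".
# This pattern is an assumption based on the log pasted in this issue.
STAGE_RE = re.compile(r"^\s*###\s*(.+?)\s*###\s*$")


def stages_reached(log_text):
    """Return the stage names found in the log, in order of appearance."""
    stages = []
    for line in log_text.splitlines():
        match = STAGE_RE.match(line)
        if match:
            stages.append(match.group(1))
    return stages
```

Reading output_file.err into a string and passing it to `stages_reached` returns the banners in order. For the log above that is "Tokenizing input" followed by "Running NER system", so nothing after the NER step announced itself.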

Thanks!

jflanigan commented 8 years ago

It could be because of the Unicode error. Which sentence caused the error, and does it happen with other sentences?
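
The code point in the warning, 0xfdd3, falls in the range U+FDD0..U+FDEF, which Unicode reserves as noncharacters. A quick way to check whether the input file itself contains such a character is a small scan like the Python sketch below. The function names are hypothetical and not part of JAMR or cdec:

```python
def is_noncharacter(cp):
    """True if code point cp is a Unicode noncharacter."""
    # Noncharacters are U+FDD0..U+FDEF plus the last two code points
    # of every plane (U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ...).
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def find_noncharacters(text):
    """Return (line_number, column, 'U+XXXX') for each noncharacter in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if is_noncharacter(ord(ch)):
                hits.append((line_no, col, "U+%04X" % ord(ch)))
    return hits
```

If this reports nothing for the input file (read as UTF-8), the character is not coming from the sentence itself.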

nahgnaw commented 8 years ago

It's just some random English sentence I typed (e.g. "The president spoke in March.").

nahgnaw commented 8 years ago

I got it working on another machine.