usnistgov / nestor-tmp2

Quantifying tacit knowledge for investigatory analysis

Failure when NLPing some CSV headers #30

Closed saschaMoccozet closed 6 years ago

saschaMoccozet commented 6 years ago

/nestor/nestor/keyword.py", line 499, in
    voc2.loc[mask, 'NE'] = voc2.loc[mask, 'NE'].apply(lambda x: NE_map[x])  # special logic for custom NE type-combinations (config.yaml)
KeyError: 'U UlureU'

The key reported in the KeyError changes depending on which header is opened, but the same file always triggers the same issue.
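To narrow down a failure like this, it can help to list the values that have no entry in the map before applying it. The snippet below is only a minimal sketch: the contents of `NE_map` and `voc2` are hypothetical stand-ins, not the real data from `keyword.py` or `config.yaml`, and falling back to the unmapped value is an assumption about acceptable behaviour, not what nestor does.

```python
import pandas as pd

# Hypothetical stand-ins for the real `NE_map` and `voc2` in keyword.py
NE_map = {"P I": "P I", "S I": "S I", "I I": "I"}
voc2 = pd.DataFrame(
    {"NE": ["P I", "U UlureU", "I I"]},
    index=["replace valve", "fan failure", "valve seal"],
)
mask = voc2["NE"].notna()

# Collect the type-combinations that have no entry in the map,
# instead of letting the lookup raise a KeyError on the first one
unmapped = sorted(set(voc2.loc[mask, "NE"]) - set(NE_map))
print("unmapped NE combinations:", unmapped)

# Map known combinations; leave unknown ones unchanged (assumed fallback)
voc2.loc[mask, "NE"] = voc2.loc[mask, "NE"].apply(lambda x: NE_map.get(x, x))
```

Printing `unmapped` shows which generated keys are malformed, which should make it easier to see why the key differs from header to header.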

rtbs-dev commented 6 years ago

Solved in 928a1f38751dc3518cfe1e3c5e512a287a494d13, but a more robust regex function for n-gram substitution is 100% necessary in the future.
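As a rough illustration of what a more robust substitution could look like (this is not the code from the commit above, and `substitute_tokens` and its mappings are hypothetical names chosen for the example), one option is a single compiled pattern that matches only whole tokens and tries longer n-grams first:

```python
import re


def substitute_tokens(text, mapping):
    """Replace whole tokens or n-grams in `text` according to `mapping`.

    Hypothetical helper: longer keys are tried first, and a key only
    matches when it is not embedded inside a larger word.
    """
    if not mapping:
        return text
    # Longest keys first, so "fan failure" wins over "fan"
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    # Single pass: replaced text is never scanned again
    return pattern.sub(lambda m: mapping[m.group(0)], text)


print(substitute_tokens("fan failure", {"fan": "I", "failure": "P"}))
print(substitute_tokens("failure", {"fa": "U"}))
print(substitute_tokens("hydraulic fan failure", {"fan": "I", "fan failure": "P I"}))
```

Because the pattern is applied in one pass and requires non-word characters (or string edges) on both sides of a match, a short key cannot rewrite the inside of a longer word, and an already substituted label is not substituted again.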