Open bennytieu opened 7 years ago
Can you isolate (and post) the document which causes the error message?
I have isolated it to this string:
Contact for company: Sven Svensson 212 584 5242 sven.svensson@email.com.
I'm guessing the sequence of numbers is at fault. Single numbers are fine; for example, other documents contain years like 2017 and cause no problems.
This example works:
Contact for company: Sven Svensson 584 5242 sven.svensson@email.com.
I did some debugging. The first example is tokenized as ['Contact', 'for', 'company', ':', 'Sven', 'Svensson', '212\xa0584\xa05242', 'sven.svensson@email.com', '.']. I suspect that the TypeError happens because some representation I rely on handles the numbers as individual tokens. I will not be able to fix this right now. Is using Python 2 an option for you?
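As a possible stopgap, the token list quoted above could be post-processed so that the phone number no longer arrives as one token containing non-breaking spaces (`\xa0`). The sketch below is only an illustration: `split_nbsp_tokens` is a hypothetical helper, not part of cort, and it has not been checked whether splitting the token this way actually avoids the TypeError.

```python
NBSP = "\xa0"  # non-breaking space, as seen inside the phone-number token


def split_nbsp_tokens(tokens):
    """Hypothetical helper: split tokens on non-breaking spaces.

    Tokens without a non-breaking space are passed through unchanged.
    """
    result = []
    for token in tokens:
        # Drop empty pieces in case a token starts or ends with NBSP.
        result.extend(part for part in token.split(NBSP) if part)
    return result


# Token list copied from the debugging output in this thread.
tokens = ['Contact', 'for', 'company', ':', 'Sven', 'Svensson',
          '212\xa0584\xa05242', 'sven.svensson@email.com', '.']

fixed = split_nbsp_tokens(tokens)
print(fixed)
```

With this input, the single phone-number token becomes the three tokens '212', '584' and '5242', and all other tokens are left as they were.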
I will try to run on Python 2 in the meantime, or just skip this special case. I'm doing a study on efficiency, so I would prefer to run it on Python 3. Thank you for your quick reply!
I was trying to run cort-predict-raw with the following command:
and got the following error message:
It works without a problem with Python 2, though. I'm running this on Ubuntu 16.04.