khalludi closed this issue 6 years ago
Khalid,
Please refer to issue #20 for a fix to your problem. In short: you need to configure your OS to allow more file descriptors to be open. Feel free to re-open this issue if that doesn't work.
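If changing the limit system-wide is not convenient, the per-process soft limit can usually be raised from Python itself on Unix-like systems. The sketch below uses the standard-library `resource` module; the helper name `raise_open_file_limit` and the default of 4096 are illustrative assumptions, and I have not verified that this alone resolves the error for every QuickUMLS installation.

```python
import resource  # Unix-only standard-library module


def raise_open_file_limit(desired=4096):
    """Raise this process's soft open-file limit toward `desired`.

    Hypothetical helper: the name and the default value are assumptions.
    Returns the soft limit in effect after the call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit can never exceed the hard limit without privileges.
    if hard != resource.RLIM_INFINITY:
        desired = min(desired, hard)
    # Nothing to do if the current soft limit is already high enough.
    if soft == resource.RLIM_INFINITY or soft >= desired:
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (desired, hard))
    return desired
```

Call it once before constructing the matcher. On some systems (macOS in particular) the kernel may reject values above its own per-process maximum, in which case `setrlimit` raises `ValueError`.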
That would seem to fix it. Thanks for the reply.
I am posting this response as an alternative to raising the open-file limit on the operating system.
I decided not to mess with kernel level settings since I am still in an early phase and the files I am scanning are relatively small. Instead, I chose to break the document into smaller portions and then match. In short, the code attempts to match the given line, but if it encounters an error then it splits the line in half and does two separate matches recursively.
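def quick_match(matcher, line):
    # Try to match the whole line; on failure, split it in half and recurse.
    ret = []
    try:
        tmp = matcher.match(line, best_match=True, ignore_syntax=False)
        ret.append(tmp)
    except Exception:
        if len(line) < 2:
            raise  # cannot split any further, so re-raise the original error
        mid = len(line) // 2
        ret.append(quick_match(matcher, line[:mid]))
        ret.append(quick_match(matcher, line[mid:]))
    return ret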
EDIT: With this method the result is a nested list (one level of nesting per split), so it is mostly unusable until it has been flattened.
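One way to avoid the flattening step is to concatenate the results while recursing, so the function always returns a single flat list. This is only a sketch: `quick_match_flat` is a hypothetical name, and it assumes `matcher.match` returns a list, as in the sample call in this thread.

```python
def quick_match_flat(matcher, text):
    """Match `text`, halving it on error; always return one flat list."""
    try:
        # Assumption: match() returns a list of results.
        return list(matcher.match(text, best_match=True, ignore_syntax=False))
    except Exception:
        if len(text) < 2:
            raise  # nothing left to split, so surface the original error
        mid = len(text) // 2
        # Concatenate the two halves instead of nesting them.
        return quick_match_flat(matcher, text[:mid]) + quick_match_flat(matcher, text[mid:])
```

Two caveats: the split can land in the middle of a word or term, which may hide a match that spans the cut, and any character offsets in the results are relative to the piece that was matched, not to the original text.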
I am trying to use QuickUMLS to classify a document. I split the document by paragraph and then match each paragraph. This works for some documents but not for others. The main error that I get when it does not work is:
Passing fewer characters per call seems to fix the problem. This is my matcher initialization:
matcher = QuickUMLS("/Users/khalid/prog/work/quickUMLS", window=20)
And this is a sample call:
match = matcher.match(line, best_match=True, ignore_syntax=False)
I'm guessing that the error appears when one section of the text contains too many CUIs. It would be a nice feature if the matcher automatically split the text into smaller pieces so that all of the CUIs can be matched without error, instead of leaving users to come up with a workaround themselves.
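Until something like that exists, a paragraph can be cut into smaller pieces on whitespace boundaries before matching, so that no word is split in half. The sketch below uses the standard-library `textwrap` module; `chunk_text` and the `max_chars` default are illustrative assumptions, and the size at which the error stops appearing would have to be found by experiment.

```python
import textwrap


def chunk_text(text, max_chars=500):
    """Split `text` into pieces of at most `max_chars` characters.

    Hypothetical helper. Breaks only at whitespace; a single word longer
    than `max_chars` is kept whole in its own piece.
    """
    return textwrap.wrap(
        text,
        width=max_chars,
        break_long_words=False,   # never cut inside a word
        break_on_hyphens=False,   # keep hyphenated terms together
    )
```

Each piece can then be passed to `matcher.match` in turn. Note that `textwrap.wrap` normalizes whitespace, so offsets in the results will not line up with the original paragraph.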