Open KimBenjaminTang opened 1 year ago
The same applies to other special characters, such as ü, ö, ä.
I don't know exactly how the strings are being processed, but "Croé T" breaks it too, while "Croé" or "Croe T" pass.
from skr_web_api import Submission

# email and apikey are the account credentials, defined elsewhere
example_text = """Croé T"""
args = "-AI -R SNOMEDCT_US_2022_03_01 --JSONf 2 -V USAbase -Z 2022AA"
inst = Submission(email, apikey)
inst.init_mm_interactive(example_text, args=args)
response = inst.submit()
"Breaking" here refers to the incomplete JSON at the end of the response, which stops at "UttText": [
So this can also be worked around by removing the "é", but in some cases that may lose valuable information.
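One way to detect this kind of truncation programmatically is to try parsing the decoded response body. The following is only a minimal sketch, assuming the body is expected to be a single JSON document. `parse_json_or_none` is a hypothetical helper, not part of the API client, and the sample strings are made-up stand-ins, not real MetaMap output.

```python
import json


def parse_json_or_none(body: str):
    """Return the parsed JSON document, or None if the body is not valid JSON."""
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # A cut-off body (unclosed brackets or braces) ends up here.
        return None


# Made-up stand-ins: one body cut off after '"UttText": [', one complete.
truncated = '{"Utterances": [{"UttText": ['
complete = '{"Utterances": [{"UttText": "Croe T"}]}'

print(parse_json_or_none(truncated) is None)
print(parse_json_or_none(complete) is not None)
```

In practice the argument would be `response.content.decode()`, and a `None` result would flag an input that needs closer inspection.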
It also breaks with the string "m² T" due to the character ² followed by another character/word. If the string contains the ² at the end with nothing following other than whitespace, it gets processed:
Hello, I am trying to have MetaMap process some translated German texts, which include words containing the letter 'ß'.
After analyzing why the JSON output breaks, I found that the character 'ß' seems to cause an error when it is part of a word (not a standalone character).
Example request:
When I decode the content of the response via response.content.decode(), it returns a broken JSON string (broken in that it does not close at the end and seems cut off):
A partial fix would be to replace the character 'ß' with 'ss' to avoid this issue, but I am not sure whether the results would be the same as with the online version of MetaMap, since words containing 'ß' are not a problem there:
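The replacement idea can be generalized to the other characters mentioned in this thread (é, ü, ö, ä, ²). The following is a rough sketch, assuming the truncation is triggered by non-ASCII input and that a lossy ASCII transliteration is acceptable. `to_ascii` and `REPLACEMENTS` are hypothetical names, not part of the API client.

```python
import unicodedata

# Explicit replacements for characters without a Unicode decomposition.
# Assumption: 'ss' is an acceptable spelling for 'ß'.
REPLACEMENTS = {"ß": "ss", "ẞ": "SS"}


def to_ascii(text: str) -> str:
    """Transliterate text to plain ASCII (lossy)."""
    for src, dst in REPLACEMENTS.items():
        text = text.replace(src, dst)
    # NFKD splits 'é' into 'e' plus a combining accent and maps '²' to '2'.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks; any other non-ASCII character is silently dropped too.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("This is a test with Straße"))
print(to_ascii("Croé T"))
print(to_ascii("m² T"))
```

Note that umlauts become plain vowels here (ü to u) rather than the German convention (ü to ue). Whether MetaMap maps the transliterated text to the same concepts as the online version does for the original text is not something this sketch can guarantee.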
Request:
User Information: fu-sung.kim-benjamin.tang@rwth-aachen.de
Run Time: 12/06/2022 06:12:29
MetaMap Version Used: metamap20
MetaMap Options: -A+ -R SNOMEDCT_US_2022_03_01 --JSONf 2 -V USAbase
Knowledge Source Used: 2022AA
Input Text:
This is a test with Straße --Output:
Can this be fixed by adjusting the MetaMap API to match the procedure of the MetaMap Online version?