dkpro / dkpro-cassis

UIMA CAS processing library written in Python
https://pypi.org/project/dkpro-cassis/
Apache License 2.0

Accessing `.type` attribute leads to gigantic output due to recursion #236

Closed: DavidHuebner closed this issue 2 years ago

DavidHuebner commented 2 years ago

**Describe the bug**
The `.type` attribute of an annotation seems to trigger runaway recursion when it is converted to a string, instead of printing the simple type name. This is quite a severe bug, as it prevents us not only from accessing `annotation.type`, but also from simply printing an annotation.

**To Reproduce**
Steps to reproduce the behavior:

  1. Download the attached very simple CAS and the corresponding type system.
  2. Execute the following code snippet:
    
    from cassis import load_cas_from_xmi, load_typesystem

    # Load the type system first; the CAS needs it to resolve its types.
    with open('typesystem.xml', 'rb') as f:
        typesystem = load_typesystem(f)

    with open('test.xmi', 'rb') as f:
        cas = load_cas_from_xmi(f, typesystem=typesystem)

    doc = cas.select("uima.tcas.DocumentAnnotation")[0]
    print(len(str(doc.type)))  # 26350638 --> OUCH

    print(doc)  # --> might crash your application
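
As a rough illustration of how a string representation can reach millions of characters without a true infinite loop, here is a minimal sketch. It assumes that type objects hold references to other type objects through their features and that the auto-generated `repr` prints every reference in full. `ToyType`, `ToyFeature` and `build_chain` are hypothetical stand-ins, not cassis classes, and this is a guess at the mechanism rather than a confirmed diagnosis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyFeature:
    """Hypothetical feature that points at the type of its value."""
    name: str
    range_type: "ToyType"


@dataclass
class ToyType:
    """Hypothetical type; the generated repr prints all features in full."""
    name: str
    features: List[ToyFeature] = field(default_factory=list)


def build_chain(depth: int) -> ToyType:
    """Build a type whose two features both point at the level below it."""
    current = ToyType("Level0")
    for i in range(1, depth + 1):
        parent = ToyType(f"Level{i}")
        parent.features.append(ToyFeature("a", current))
        parent.features.append(ToyFeature("b", current))
        current = parent
    return current


# The same shared object is printed once per reference, so the repr
# more than doubles in length with every additional level.
for depth in (2, 8, 12):
    print(depth, len(repr(build_chain(depth))))
```

There is no cycle in this toy graph, so nothing recurses forever; the output is simply exponential in the nesting depth, which would be consistent with a finite but enormous `len(str(doc.type))`.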



**Expected behavior**
We should be able to access the `.type` feature safely.

**Error message**
![image](https://user-images.githubusercontent.com/14200897/141788948-71a17923-0bd8-4eff-8ed1-a68851f811da.png)

**Please complete the following information:**
 - Version: 0.6.1
 - OS: Ubuntu 20.04

**Additional context**
Files: [files.zip](https://github.com/dkpro/dkpro-cassis/files/7538980/files.zip)

reckart commented 2 years ago

How about using `type.name`?

DavidHuebner commented 2 years ago

This works in principle, but we should guarantee that the `Type` object has a proper string representation. This matters especially because calling `str(annotation)` (or `print(annotation)`) always calls `str(annotation.type)` and therefore produces the same kind of messy output.

So maybe adding something like the following representation to the `Type` class would already do the job:

    def __str__(self):
        return f"Type(name={self.name})"

jcklie commented 2 years ago

We added your suggested change and it will be in the next release.
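
The effect of the suggested `__str__` can be tried out on a small stand-in model. The sketch below is an assumption-laden illustration, not the cassis implementation: `ToyType` and `ToyAnnotation` are hypothetical dataclasses, and only the `__str__` body is taken from the thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyType:
    """Hypothetical stand-in for a type object; not the cassis class."""
    name: str
    children: List["ToyType"] = field(default_factory=list)

    def __str__(self) -> str:
        # Concise form, as suggested in the thread.
        return f"Type(name={self.name})"


@dataclass
class ToyAnnotation:
    """Hypothetical annotation whose string form embeds str(type)."""
    type: ToyType
    begin: int
    end: int

    def __str__(self) -> str:
        return f"Annotation(type={self.type}, begin={self.begin}, end={self.end})"


leaf = ToyType("uima.tcas.DocumentAnnotation")
root = ToyType("uima.tcas.Annotation", children=[leaf])
doc = ToyAnnotation(type=leaf, begin=0, end=42)

print(doc.type.name)  # the workaround: read the name directly
print(str(root))      # Type(name=uima.tcas.Annotation)
print(doc)            # Annotation(type=Type(name=uima.tcas.DocumentAnnotation), begin=0, end=42)
print(repr(root))     # still the full nested dataclass repr
print([root])         # containers call repr() on their elements, so this stays verbose
```

Note that overriding only `__str__` leaves `repr()` untouched, so lists, debuggers and interactive prompts would still show the verbose form in this toy model. Whether the released change also adjusts `__repr__` is not stated in the thread.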