hsnr-gamera / gamera-4

Gamera 4 for Python 3
GNU General Public License v2.0
10 stars 8 forks source link

knnmodule.hpp: Type Error raised when id is a string #69

Open malajvan opened 1 year ago

malajvan commented 1 year ago

I'm unsure if this is an issue and would like to get clarifications. We are using gamera-4 to classify 2 xml inputs using the Interactive Classifier. This error was raised when classify_with_images was called. In knncoremodule.cpp the function knn_classify_with_images is calling image_get_id_name where it throws the error "knn: could not get string from id_name tuple." on line 90. More specifically, this one:

Py_ssize_t lenTmp = 0;
  if(PyBytes_AsStringAndSize(id, id_name, &lenTmp) < 0){
      PyErr_SetString(PyExc_TypeError, "knn: could not get string from id_name tuple.");
      return -1;
  }

Further investigation suggested that the error is being thrown when id (which is the class_name as I understand) is not of type bytes and we worked around this error by converting it from PyUnicode to a PyBytes object. My main question is was id intended to be a PyBytes object? I looked through core.py and under classify_manual and classify_automatic (line 341 and 367), id_name was checked to see whether it is a string or unicode object with the util function (line 359). If i misunderstood the how this works I would really appreciate it if you can correct me. Thank you very much!

cdalitz commented 1 year ago

Thanks for pointing this out. In python2, the id was of type string which no longer exists in python3 and can be replaced with either Unicode or Bytes. Apparently during porting Gamera to python3, the decision was made to use Bytes for the id. I have just noted that this causes other problems, too, e.g. when calling id.startswith with a plain string.

Where did you add your workaround of converting the id to a Bytes object? In your copy of teh gamera code? In your application code?

If you can provide a simple test script and simple test data that causes the problem, this would be helpful.

malajvan commented 1 year ago

We did the fix on our own fork, gamera4-rodan on the branch fix_IC. The fix is in the file knnmodule.hpp under this commit. The error came from our Interactive Classifier python app which calls classify_glyph_automatic in classify.py in gamera4. You can see more of the issue here I'll try to write a simple test case.

cdalitz commented 1 year ago

Did you mange to create a small test case that reproduces the error? I think it would be best to change the id to be Unicode instead of Bytes, but I would need a test case that reproduces the bug and check whether it actually goes away when the changes are made.

malajvan commented 1 year ago

I have trouble building and testing gamera on my local environment so creating a test script is a bit difficult at the moment. We did experience some issue with another job that uses the same function image_get_id_name where it is in fact passing in a byte object (that's mean gamera is passing in bytes sometimes and string at other). We got around the issue with an if else to check if the passed in names are string or bytes then convert accordingly.

FriedrichFroebel commented 1 year ago

I have trouble building and testing gamera on my local environment so creating a test script is a bit difficult at the moment.

Are there any details on your issues? Maybe there is some guidance available on this.