Open babak-kananpour opened 2 years ago
Hi, I just ran the notebook again and did not encounter the issue. Could you please try again with the latest notebook linked in the README? If the error persists, I can look deeper into this.
Best regards,
Hi @xzyaoi, I still have the same problem with the new version. In the demo notebook, the error occurs when this line runs:
importances = importance.fit(X_train_dirty, y_train_dirty).score(X_test, y_test)
As a workaround, I used the pure-Python function "compute_all_importances" in "datascope/importance/shapley" instead of the Cython version "compute_all_importances_cy".
The variables "unit_distances, unit_utilities, null_scores" are floats, which is correct, but shapley_cy.pyx expects them to be integers.
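The workaround described above (use the pure-Python routine when the compiled one rejects the input) can be sketched generically. This is only an illustration: `fast_sum_ints`, `reference_sum`, and `with_fallback` are hypothetical stand-ins and are not part of datascope.

```python
def fast_sum_ints(values):
    """Hypothetical stand-in for a compiled routine that only accepts integers."""
    if not all(isinstance(v, int) for v in values):
        raise TypeError("expected integer input")
    return sum(values)


def reference_sum(values):
    """Hypothetical stand-in for the slower pure-Python implementation."""
    return sum(values)


def with_fallback(fast, reference, type_errors=(TypeError, ValueError)):
    """Return a callable that tries `fast` first and falls back to `reference`
    when `fast` rejects its arguments with one of `type_errors`."""
    def wrapper(*args, **kwargs):
        try:
            return fast(*args, **kwargs)
        except type_errors:
            return reference(*args, **kwargs)
    return wrapper


safe_sum = with_fallback(fast_sum_ints, reference_sum)
int_result = safe_sum([1, 2, 3])     # handled by the fast path
float_result = safe_sum([0.5, 1.5])  # fast path raises, reference path is used
print(int_result, float_result)
```

A fallback like this hides the type mismatch rather than fixing it, so it is only a stopgap until the compiled function accepts the right types.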
@babak-1990 Interesting, I still cannot reproduce this error, even with a newly created colab environment (see https://colab.research.google.com/drive/1RdArqm0ZpYR_Tq5rKMDu8U7KxsgNEhNl#scrollTo=8b974636-7c3e-4b82-8401-ff541a47a002).
I am now thinking this is due to your local compiler, which may behave differently with np.int (are you on Windows or Mac OS?). I have found a possible solution: https://github.com/eragonruan/text-detection-ctpn/issues/380
However, I don't have a Windows PC at hand. Could you please try changing np.int to np.int64, or np.int32 if np.int64 does not work out, in this line (https://github.com/easeml/datascope/blob/main/datascope/importance/shapley_cy.pyx#L30)? After re-compiling, it should work.
If it works, please let me know so I can release a stable fix for this. If it doesn't, please also feel free to reach out!
Best regards, Xiaozhe
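One possible reason the same code behaves differently on Windows is integer width. In NumPy releases before 2.0, the default integer dtype followed the C `long`, which is 32 bits wide on 64-bit Windows and 64 bits wide on most 64-bit Linux and macOS systems. The standard-library sketch below prints these widths for the current machine. Whether this is the actual cause of the error in this issue is an assumption, not a confirmed diagnosis, and `c_integer_widths` is a hypothetical helper.

```python
import ctypes
import platform


def c_integer_widths():
    """Return the bit width of several C integer types on this platform."""
    return {
        "long": ctypes.sizeof(ctypes.c_long) * 8,           # platform dependent
        "long long": ctypes.sizeof(ctypes.c_longlong) * 8,
        "int32": ctypes.sizeof(ctypes.c_int32) * 8,          # fixed width
        "int64": ctypes.sizeof(ctypes.c_int64) * 8,          # fixed width
    }


widths = c_integer_widths()
print(platform.system(), widths)
```

Fixed-width names such as np.int32 and np.int64 avoid this platform dependence, which is why they are the usual replacement for np.int in typed Cython code.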
Hi @xzyaoi, I had already read this potential solution and tried assigning a different DTYPE, but it didn't work. My local computer runs Windows. You are right that this is due to my local compiler. Changing this line (https://github.com/easeml/datascope/blob/main/datascope/importance/shapley_cy.pyx#L30) won't fix the problem, because the error happens before entering the function compute_all_importances_cy in /datascope/importance/shapley_cy.pyx. I gave it a try anyway to be sure.
At that time, I thought that changing lines https://github.com/easeml/datascope/blob/8a8e397686df318e2e3e5c32b30ba4b80244c522/datascope/importance/shapley_cy.pyx?rgh-link-date=2022-06-17T09%3A39%3A48Z#L11 and https://github.com/easeml/datascope/blob/8a8e397686df318e2e3e5c32b30ba4b80244c522/datascope/importance/shapley_cy.pyx?rgh-link-date=2022-06-17T09%3A39%3A48Z#L12 might fix the problem, but it didn't.
@babak-1990, thanks for your reply! I can reproduce this error with a Windows setup. I am working on a fix, and it should be ready soon. I will update here when I make progress :)
Thanks for this great library. I followed the instructions in the readme.md file and ran the setup.
I get the following error when trying to test on the notebook "DataScope-Demo-1.ipynb":
There seems to be a type issue when calling the compute_all_importances_cy function: it expects integers but receives floats (doubles?).
I tried to modify compute_all_importances_cy in shapley_cy.pyx, but had no luck fixing this bug.