Open faterazer opened 1 year ago
Hi @faterazer, thanks for reaching out! We use whatever the most current version of SKL is, so right now 1.2.1.
Was your model trained on the same version of scikit-learn that you're trying to use Hummingbird with? Just trying to make sure it's not a simple fix. (Lots of times, users have issues if the model is trained with an older version of SKL and then they call Hummingbird on a saved model.)
Can you post a little bit of your code so we can take a look? Maybe we need to add the new field.
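The version check suggested above could be sketched roughly as follows. This is only an illustration using the standard library's `importlib.metadata`; the helper names `installed_version` and `same_version` are made up for this example, and `trained_with` is assumed to be a version string you recorded yourself when the model was trained.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def same_version(trained_with, dist_name="scikit-learn"):
    """True when the installed version equals the one recorded at training time.

    `trained_with` is a hypothetical value: the version string noted down
    when the model was fitted and saved.
    """
    return installed_version(dist_name) == trained_with


if __name__ == "__main__":
    # Print the versions that matter for this kind of report.
    for name in ("scikit-learn", "hummingbird-ml", "torch"):
        print(name, installed_version(name) or "not installed")
```

Pasting the printed versions into the issue alongside the code makes it easier to rule out a train/convert version mismatch.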
Hi, I really appreciate your suggestions. I read your message and went back through my steps. Unfortunately, the problem still exists. More details should make it easier for you to locate the problem, so I have put my code and test data in test.zip. Here is my processing flow:
1. In test.zip, I constructed some test data. All features are categorical, fifteen columns in total. I saved the data as test/test.csv.
Could you look over my steps and code? Did I make a mistake anywhere? Or is there a way to fix the problem? I appreciate your time and effort.
Thanks again for all your work on hummingbird-ml. It's an awesome project, and I hope I can keep using it.
Yours sincerely, faterazer
From: Karla Saur. Sent: February 10, 2023, 4:47. To: microsoft/hummingbird. Cc: fater; Mention. Subject: Re: [microsoft/hummingbird] [sklearn] OneHotEncoder does't work correctly (Issue #684)
Hello! I think that the attachment (test.zip) got dropped. If it's easier, you could check them into a fork in github and put a link!
test.zip
How about this time? I'm replying directly through GitHub.
Thank you for your in-depth example with details! I was able to reproduce everything you said.
Yes, it looks like we need to add this feature to the list of supported options (and we should at least raise an error for the ones we don't support). We'll add that to the queue!
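As a rough idea of what "raise an error for options we don't support" might look like, here is a minimal sketch. It is not Hummingbird's code: `GUARDED_OPTIONS`, `check_supported`, and `FakeEncoder` are hypothetical names, and the listed options are real `OneHotEncoder` parameter names (all defaulting to `None`) chosen purely for illustration, not Hummingbird's actual support list.

```python
# Hypothetical list of options this sketch refuses to convert when set.
# Real OneHotEncoder parameter names, but NOT Hummingbird's support matrix.
GUARDED_OPTIONS = ("drop", "min_frequency", "max_categories")


def check_supported(estimator):
    """Raise NotImplementedError if any guarded option is not left at None."""
    params = estimator.get_params()
    unsupported = {
        name: params[name]
        for name in GUARDED_OPTIONS
        if params.get(name) is not None
    }
    if unsupported:
        raise NotImplementedError(
            f"{type(estimator).__name__}: unsupported options {unsupported}"
        )


class FakeEncoder:
    """Stand-in exposing get_params() like a scikit-learn estimator."""

    def __init__(self, **params):
        self._params = params

    def get_params(self):
        return dict(self._params)


if __name__ == "__main__":
    check_supported(FakeEncoder(drop=None))       # passes silently
    try:
        check_supported(FakeEncoder(drop="first"))
    except NotImplementedError as err:
        print(err)
```

Failing loudly at conversion time would turn a silent shape mismatch into an explicit message that names the offending option.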
So glad my example helped. I really hope the problem can be solved in the near future. Thanks for your efforts. 🙂
From: Karla Saur. Sent: February 15, 2023, 8:54. To: microsoft/hummingbird. Cc: fater; Mention. Subject: Re: [microsoft/hummingbird] [sklearn] OneHotEncoder does't work correctly (Issue #684)
Hello, I found this project last week. Thanks for all of this work.
I installed Hummingbird-ml==0.47 with pip, and I want to know which version of sklearn I should use. I want to use sklearn's one-hot encoder to preprocess my categorical features, but the output dimension from sklearn differs from that of the converted PyTorch model. For sklearn, 15 features -> 69 dims, but for the converted PyTorch model, 15 features -> 76 dims.
After checking, I'm sure the problem is caused by an argument of sklearn's OneHotEncoder:
Is there any way to solve this problem? Thanks for any solution!
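To make the dimension arithmetic above concrete: with default settings, a one-hot encoder emits one output column per distinct category in each feature, so the width is the sum of the per-column category counts. Any option that drops or merges categories shrinks that width, and a converter that ignores such an option would produce a wider output. The sketch below is plain Python on toy data; `onehot_width` and `dropped_per_column` are hypothetical names, and dropping one category per column is only an assumed example, not necessarily the argument involved in this issue.

```python
def onehot_width(rows, dropped_per_column=0):
    """Width of a one-hot encoding: distinct values per column, summed.

    dropped_per_column models an option that removes a fixed number of
    categories from every column (an assumption for illustration only).
    """
    columns = list(zip(*rows))  # transpose rows into columns
    return sum(max(len(set(col)) - dropped_per_column, 0) for col in columns)


# Toy data: three categorical features with 3, 3 and 2 distinct values.
rows = [
    ("red", "S", "yes"),
    ("blue", "M", "no"),
    ("green", "S", "yes"),
    ("red", "L", "no"),
]

full = onehot_width(rows)        # every category kept
reduced = onehot_width(rows, 1)  # one category removed per column
print(full, reduced)
```

The same comparison can be made on real data by checking the second dimension of the encoder's transformed output against that of the converted model.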