benmiroglio / pymatch

MIT License
272 stars 128 forks

error in method _scores_to_accuracy() of Matcher.py #23

Closed caixiaocherry closed 3 years ago

caixiaocherry commented 4 years ago

I got an error when I tried to run the example call m.fit_scores(balance=True, nmodels=10). When the function calls the static method _scores_to_accuracy(), it fails with a size mismatch: inside that function, y is a DataFrame of shape (n, 1), while preds is a list. I fixed the code by converting y to a NumPy array and transposing it before the comparison. The original function is shown first, followed by my fix:

def _scores_to_accuracy(m, X, y):
    preds = [1.0 if i >= .5 else 0.0 for i in m.predict(X)]
    return (y == preds).sum() * 1.0 / len(y)

def _scores_to_accuracy(m, X, y):
    preds = [1.0 if i >= .5 else 0.0 for i in m.predict(X)]
    # return (y == preds).sum() * 1.0 / len(y)
    return (y.to_numpy().T == preds).sum() * 1.0 / len(y)

The code above works for me.

xiaolinzhuo commented 4 years ago

I found the same problem! I fixed it with return (y.values == preds).sum() * 1.0 / len(y)

swyoon commented 4 years ago

The solution by @caixiaocherry worked for me, but @xiaolinzhuo 's didn't.

tszumowski commented 4 years ago

@caixiaocherry 's solution worked for me. This was with these versions:

pandas==0.25.1
-e git+git@github.com:benmiroglio/pymatch.git@982778f3fe438f6d6b2905472a3951722edca266#egg=pymatch

@caixiaocherry it may be worth making a PR given multiple people had this issue?

tszumowski commented 4 years ago

Actually on additional inspection, this seems similar to https://github.com/benmiroglio/pymatch/pull/12

diogoalvesderesende commented 4 years ago

Hey everyone, I am quite new to python and am having this issue as well. Can anyone tell me how to make the changes that @caixiaocherry suggested? I tried to google it but to no avail. Thanks a lot!

caixiaocherry commented 4 years ago

@diogoalvesderesende, you only need to change the following return statement:

return (y == preds).sum() * 1.0 / len(y)

to this return statement:

return (y.to_numpy().T == preds).sum() * 1.0 / len(y)

This should solve the problem.

w2998 commented 4 years ago

@caixiaocherry's solution did not work for me; it returned the error: Fitting Models on Balanced Samples: 1\100Error: 'DataFrame' object has no attribute 'to_numipy'. @xiaolinzhuo's solution runs, but the average accuracy seems wrong: it returned a value greater than 1. I was using the Lending Club data, as in this article: https://medium.com/@bmiroglio/introducing-the-pymatch-package-6a8c020e2009.
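A note on the accuracy above 1: it is consistent with NumPy broadcasting. The sketch below uses made-up labels and predictions (y_col and preds are illustrative stand-ins, not pymatch data) and assumes y.values has shape (n, 1), as described in the opening comment. Comparing an (n, 1) array with a flat list of length n produces an (n, n) grid, so the sum can exceed n. Transposing first keeps the comparison element-wise.

```python
import numpy as np

# Hypothetical stand-ins: an (n, 1) label column and a flat prediction list.
y_col = np.array([[1.0], [0.0], [1.0], [1.0]])
preds = [1.0, 0.0, 0.0, 1.0]

# (n, 1) == (n,) broadcasts to (n, n): every label against every prediction.
grid = (y_col == preds)
inflated = grid.sum() * 1.0 / len(y_col)

# Transposing to (1, n) lines the labels up with the list, one to one.
aligned = (y_col.T == preds)
accuracy = aligned.sum() * 1.0 / len(y_col)

print(grid.shape, inflated)     # (4, 4) 2.0
print(aligned.shape, accuracy)  # (1, 4) 0.75
```

If y is a one-dimensional Series rather than a one-column DataFrame, y.values is already flat and the comparison is element-wise, which may be why the y.values variant works for some people in this thread and not for others.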

caixiaocherry commented 4 years ago

Are you sure you wrote the correct function name? It should be to_numpy, not to_numipy.

w2998 commented 4 years ago

Oh, yes, that was a typo. Now it works, and the results make sense.

brendachang12 commented 4 years ago

I tried both

return (y.values == preds).sum() * 1.0 / len(y)
return (y.to_numpy().T == preds).sum() * 1.0 / len(y)

in the Matcher.py file, but neither of them worked. Is there another way to get around this error?

This is the error I'm receiving:

Fitting Models on Balanced Samples: 1\100Error: Unable to coerce to Series, length must be 1: given 4484
Fitting Models on Balanced Samples: 1\100Error: Unable to coerce to Series, length must be 1: given 4484
Fitting Models on Balanced Samples: 1\100Error: Unable to coerce to Series, length must be 1: given 4484
Fitting Models on Balanced Samples: 1\100Error: Unable to coerce to Series, length must be 1: given 4484
Fitting Models on Balanced Samples: 1\100Error: Unable to coerce to Series, length must be 1: given 4484

Average Accuracy: nan%

caixiaocherry commented 4 years ago

Unable to coerce to Series, length must be 1

Could you print out y and preds to check the shape? It seems the broadcast failed.
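To make the shape check concrete, here is a minimal sketch with made-up data (the frame y and the list preds are illustrative, not taken from pymatch). It assumes y is a one-column DataFrame. On the pandas versions I am aware of, comparing such a frame directly with a longer list raises the "Unable to coerce to Series" message quoted above. Seeing that message would therefore suggest that the original y == preds line is still the one being executed.

```python
import pandas as pd

# Hypothetical (n, 1) label frame and a flat list of predictions.
y = pd.DataFrame({"label": [1.0, 0.0, 1.0, 1.0]})
preds = [1.0, 0.0, 0.0, 1.0]

print(y.shape, len(preds))  # the shapes worth checking first

try:
    y == preds  # pandas tries to match the list to the single column
    message = None
except ValueError as exc:
    message = str(exc)
print(message)

# Comparing the column itself (a Series) is element-wise.
accuracy = (y["label"] == preds).mean()
print(accuracy)
```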

PeterOrmosi commented 4 years ago

@diogoalvesderesende, you only need to change the following return statement:

return (y == preds).sum() * 1.0 / len(y)

to this return statement:

return (y.to_numpy().T == preds).sum() * 1.0 / len(y)

This should solve the problem.

This worked fine for me, thanks.

HongruZhai commented 4 years ago

@caixiaocherry 's solution works like a charm. Thanks!

RishabhArora90 commented 3 years ago

I am facing the same issue. Can anyone please explain how to make the change in the source code?

caixiaocherry commented 3 years ago

@RishabhArora90, I think this bug has been patched, so you might only need to reinstall the package.

ziadzee commented 3 years ago

Hello,

I am currently having this issue although I have added @caixiaocherry solution to the source code:

    @staticmethod
    def _scores_to_accuracy(m, X, y):
        preds = [1.0 if i >= .5 else 0.0 for i in m.predict(X)]
        # return (y == preds).sum() * 1.0 / len(y)
        return (y.to_numpy().T == preds).sum() * 1.0 / len(y)

Am I missing something? Do I need to downgrade pandas?

Thanks

adriennekline commented 3 years ago

I made the recommended change, but now I am getting an accuracy of 219400.0%. The error was stated as: 'Static column dropped: resultError: Perfect separation detected, results not available'.

CarlosDullius commented 3 years ago

I found the same problem! I fixed it with return (y.values == preds).sum() * 1.0 / len(y)

It worked for me. I have installed the package today

medcharleslaidi commented 2 years ago

I installed pymatch and I am also getting this error. I understand that I need to change the pymatch code. I installed the package using pip install pymatch. Could someone point me to a tutorial that would help me change the code and use this function? Thank you very much.

CarlosDullius commented 2 years ago

Go to C:\Users\{YOURUSER}\AppData\Local\Programs\Python\Python36\Lib\site-packages\pymatch or, if you use Anaconda, C:\Users\{YOURUSER}\Anaconda3\Lib\site-packages\pymatch, and in Matcher.py change

return (y == preds).sum() * 1.0 / len(y)

to the following statement:

return (y.to_numpy().T == preds).sum() * 1.0 / len(y)

PS: I think it is the same on Linux: .../Python36/Lib/site-packages/pymatch
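If you are unsure which copy of Matcher.py your interpreter actually loads, you can ask Python rather than guess the folder. This is a small sketch using only the standard library. The helper name locate_module_file is made up for illustration, and the module name pymatch.Matcher is assumed from the file paths mentioned in this thread.

```python
import importlib.util


def locate_module_file(module_name):
    """Return the file Python would load for module_name, or None if not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when the parent package of a dotted name is not installed.
        return None
    return spec.origin if spec is not None else None


# Prints the path of the installed Matcher.py, or None if pymatch is missing.
print(locate_module_file("pymatch.Matcher"))
```

The printed path is the file to edit. After editing, restart the Python session or notebook kernel so that the changed file is imported again.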

kelleyjbrady commented 2 years ago

I am encountering this error in April 2022, but I can see that Matcher.py has already been fixed in lines: 523-526 to:

@staticmethod
def _scores_to_accuracy(m, X, y):
    preds = [[1.0 if i >= .5 else 0.0 for i in m.predict(X)]]
    return (y.to_numpy().T == preds).sum() * 1.0 / len(y)

Is there another solution to this bug?
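One shape-independent alternative, offered only as a sketch and not as the package's own code, is to flatten both sides to one dimension before comparing. The function name scores_to_accuracy and the FixedScoreModel class below are hypothetical. The only assumption about the model is that m.predict(X) returns one score per row.

```python
import numpy as np


def scores_to_accuracy(m, X, y):
    """Fraction of rows where the thresholded prediction equals the label."""
    # Flatten both sides so an (n, 1) column, an (n,) array, and a plain list
    # all compare element-wise instead of broadcasting.
    preds = (np.asarray(m.predict(X)).ravel() >= 0.5).astype(float)
    labels = np.asarray(y, dtype=float).ravel()
    if preds.shape != labels.shape:
        raise ValueError(f"shape mismatch: preds {preds.shape}, labels {labels.shape}")
    return float((labels == preds).mean())


class FixedScoreModel:
    """Hypothetical stand-in for a fitted model; predict() returns fixed scores."""

    def __init__(self, scores):
        self.scores = scores

    def predict(self, X):
        return self.scores


model = FixedScoreModel([0.9, 0.2, 0.4, 0.7])
y_col = np.array([[1.0], [0.0], [1.0], [1.0]])  # (n, 1), like a one-column DataFrame
print(scores_to_accuracy(model, None, y_col))   # 0.75
```

Because np.asarray also accepts a DataFrame or a Series, the same function should work whichever of the two y happens to be. The explicit shape check turns a silent broadcast into a clear error.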

philffm commented 2 years ago

@kelleyjbrady same error here - also in April 2022 🤷🏽‍♂️

CarlosDullius commented 2 years ago

@kelleyjbrady and @philffm, follow my instructions above; I am sure they will solve your problem. The fix was not applied in

master/build/lib/pymatch/Matcher.py

@kelleyjbrady, you are probably looking at

master/pymatch/Matcher.py

But when you install the package with pip, you get the code from the build folder.

kelleyjbrady commented 2 years ago

Thanks @CarlosDullius I will check it out. I ended up using a propensity matching package that is being presented in July at EMBC 2022, the author has written a blog post on Medium and uploaded to PyPi, but the author (@adriennekline) hasn’t updated the github page yet. I think anyone who has followed this thread all the way down here will be able to figure out how to use the package despite current lack of extensive documentation (the PyPi page has a 'quick start'). It was pretty easy to get it up and running on a simple age+sex match I was working on. @philffm may be interested.

Kwakyejin commented 1 year ago

@staticmethod
def _scores_to_accuracy(m, X, y):
    preds = [1.0 if i >= .5 else 0.0 for i in m.predict(X)]
    # return (y == preds).sum() * 1.0 / len(y)
    return (y.to_numpy().T == preds).sum() * 1.0 / len(y)

Ayeshasaeedhaq commented 1 year ago

Has anyone updated the code in the package, or has anyone created a clone with corrected code? I am not sure whether I am able to correct the code on my end, because now I am getting the following error:

'bool' object has no attribute 'sum'
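One way this message can arise, shown here with made-up values, is when the == comparison returns a single Python bool instead of an array. Comparing two plain lists does exactly that. Some NumPy versions also return a scalar False, with a warning, when the two sides cannot be compared element-wise. Which case applies here cannot be determined from the message alone, so treat this as a sketch of the symptom rather than a diagnosis.

```python
# Hypothetical values: both sides are plain lists, not arrays.
y_list = [1.0, 0.0, 1.0]
preds = [1.0, 0.0, 1.0]

result = (y_list == preds)    # list == list yields one bool, not an array
print(type(result).__name__)  # bool

try:
    result.sum()
    message = None
except AttributeError as exc:
    message = str(exc)
print(message)                # 'bool' object has no attribute 'sum'
```

Printing the type and shape of y and preds just before the return statement should show which side is not the array you expect.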