Melkaz closed 1 year ago
Nice catch... I don't have time to look at this as I'm on holiday for the next 3/4 weeks. I'll take a look as soon as I get back, if no one else has.
Hello.
I tried to run the same code as you and I got an error:
ValueError: shapes (1,4) and (6,4) not aligned: 4 (dim 1) != 6 (dim 0)
Any idea why this happens?
I think I ran into the same problem. I used the following code:
import pandas as pd
import numpy as np
from prince import FAMD
famd = FAMD(n_components=3)
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
df["A"] = df["A"].astype("category")
famd.fit(df)
print(famd.transform(df[0:5]))
print(famd.transform(df)[0:5])
When I ran it, the output was:
          0         1         2
0 -0.346965  0.016770 -0.020404
1  0.150889  0.043458  0.121991
2  0.162368 -0.073471 -0.066656
3 -1.506365  1.515455  0.937305
4  0.266976 -0.104443  0.016896
          0         1         2
0 -0.352857  0.029198 -0.059385
1  0.190614  0.034506  0.126858
2  0.272716 -0.154023 -0.174798
3 -6.194747  6.564436  4.242015
4  0.414774 -0.215533  0.112739
I also noticed that PCA and MCA both work well independently. Looking at the code, I think it might be related to this line -> https://github.com/MaxHalford/prince/blob/ba8a66b6575320832b118186745ecfd85c896bdc/prince/mfa.py#L98
There, the data passed to transform is normalized before being projected, apparently using its own statistics. It should instead be normalized using the statistics of the data the model was fitted on.
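To illustrate the mechanism I suspect, here is a minimal NumPy sketch. It is not prince's code: the helper names (`fit_stats`, `scale_with_own_stats`, `scale_with_fitted_stats`) are made up for this example, and plain mean/std scaling stands in for whatever normalization the library actually applies. It only shows that rescaling a subset with its own statistics cannot agree with rescaling the full data and then slicing.

```python
import numpy as np


def fit_stats(X):
    """Column-wise mean and standard deviation of the training data."""
    return X.mean(axis=0), X.std(axis=0)


def scale_with_own_stats(X):
    """Suspected behaviour: normalize with the statistics of X itself."""
    mean, std = fit_stats(X)
    return (X - mean) / std


def scale_with_fitted_stats(X, mean, std):
    """Expected behaviour: normalize with statistics stored at fit time."""
    return (X - mean) / std


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))   # stand-in for the fitted data
mean, std = fit_stats(X)        # what fit() would need to remember

subset = X[:5]
full_then_slice = scale_with_fitted_stats(X, mean, std)[:5]
subset_fitted = scale_with_fitted_stats(subset, mean, std)
subset_own = scale_with_own_stats(subset)

# Using the fitted statistics, the order of slicing and scaling is irrelevant.
consistent = np.allclose(full_then_slice, subset_fitted)
# Using the subset's own statistics, the two results disagree.
inconsistent = not np.allclose(full_then_slice, subset_own)
print(consistent, inconsistent)
```

If this is what happens, any projection computed on top of the rescaled data would depend on which rows are passed to transform, which matches the symptom above.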
If this is still a bug, can someone please produce a minimal working example?
@MaxHalford I think it does not get much more minimal than this:
import pandas as pd
from prince import FAMD
famd = FAMD(n_components=1)
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df["A"] = df["A"].astype("category")
famd.fit(df)
print(famd.transform(df[0:1]))
print(famd.transform(df)[0:1])
Output:
     0
0 -0.5
          0
0 -1.414214
Hello there 👋
I apologise for not answering earlier. I was not maintaining Prince anymore. However, I have just refactored the entire codebase. This refactoring should have fixed many bugs.
I don’t have time and energy to check if this fixes your issue, but there is a good chance it does. Feel free to reopen this issue if the problem persists after installing the new version — that is, version 0.8.0 and onwards.
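For anyone who wants to check whether the problem persists, here is a small sketch of a consistency check. It assumes only that the model exposes a `transform` method returning something array-like, as in the snippets above. The helper `transform_is_consistent` and the two stand-in classes are hypothetical and exist only so the example runs without prince installed; to test the real thing, pass a fitted FAMD object instead.

```python
import numpy as np
import pandas as pd


def transform_is_consistent(model, df, n_rows=1, atol=1e-8):
    """True if transforming the first rows equals slicing the full transform."""
    part = np.asarray(model.transform(df.iloc[:n_rows]))
    full = np.asarray(model.transform(df))[:n_rows]
    return part.shape == full.shape and bool(np.allclose(part, full, atol=atol))


class FittedCentering:
    """Hypothetical stand-in: centers with the means stored at fit time."""

    def fit(self, df):
        self.mean_ = df.mean()
        return self

    def transform(self, df):
        return df - self.mean_


class RefitCentering:
    """Hypothetical stand-in: wrongly centers with the input's own means."""

    def fit(self, df):
        return self

    def transform(self, df):
        return df - df.mean()


df = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})
good = transform_is_consistent(FittedCentering().fit(df), df)
bad = transform_is_consistent(RefitCentering().fit(df), df)
print(good, bad)
```

With a real model, the call would look like `transform_is_consistent(famd, df)` after `famd.fit(df)`. A `False` result would suggest the issue is still present.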
Hello,
I'd like to:
In the example below, I have a dataframe that I use to fit a model. I then pick a single row from the original dataframe and attempt to project it with the created FAMD model.
Why do I obtain different values between:
Is there something I missed?
Thanks!
Running code here: https://repl.it/repls/WearyMajorNature