Closed danielruss closed 4 years ago
It has been fixed. Update your version to 0.1.2. Thanks.
I updated the link for the CDN to 0.1.2, and I got the same results.
<script src="https://cdn.jsdelivr.net/npm/danfojs@0.1.2/dist/index.min.js"></script>
That's true. I was able to reproduce the error. We will fix it.
Hello, if you have a dataframe that contains strings, you cannot sort it. So I tried to use the label encoder to create a new column filled with the encoded labels. If you sort on the encoded labels, the dataframe is corrupted.
Here is a browser example:
```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <script src="https://cdn.jsdelivr.net/npm/danfojs@0.1.1/dist/index.min.js"></script>
    <title>Document</title>
  </head>
  <body>
    <script>
      df = new dfd.DataFrame({
        X1: ["c", "a", "b", "c", "c", "a", "b"],
        X2: ["^", "%", "!", "#", "$", "&", "*"],
      });
      let encoder = new dfd.LabelEncoder();
      let x2 = df.X1.unique().values.sort();
      console.log(x2);
      encoder.fit(x2);
      df.addColumn({ column: "X1_encoded", value: encoder.transform(df.X1) });
      df2 = df.sort_values({ by: "X1_encoded" });
      df.print()
      df2.print()
    </script>
  </body>
</html>
```
The output is:
```
i  X1  X2  X1_encoded
0  c   ^   2
1  a   %   0
2  b   !   1
3  c   #   2
4  c   $   2
5  a   &   0
6  b   *   1

i  X1  X2  X1_encoded
1  a   %   0
1  a   %   0
2  b   !   1
2  b   !   1
0  c   ^   2
0  c   ^   2
```
Notice that the rows are all the same for each level of X1_encoded. The original rows 3-6 are lost.
I was also able to reproduce this error. It occurs when sorting a column with non-unique entries: the index gets duplicated. This is a minor fix, and @steveoni is currently on it.
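To illustrate how non-unique entries can lead to a duplicated index, here is a minimal plain-JavaScript sketch. It is not taken from the danfojs source; the helper names `argsortByLookup` and `argsort` are hypothetical, and the lookup-based approach is only an assumption about one way this kind of duplication can arise. The second helper shows an index-based sort that keeps every row exactly once.

```javascript
// Data mirroring the columns from the report above.
const x1 = ["c", "a", "b", "c", "c", "a", "b"];
const x2 = ["^", "%", "!", "#", "$", "&", "*"];
const keys = [2, 0, 1, 2, 2, 0, 1]; // the X1_encoded column

// Fragile approach (hypothetical): sort the values, then find each
// sorted value's position with indexOf. indexOf always returns the
// FIRST match, so repeated values all map to the same row.
function argsortByLookup(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted.map((v) => values.indexOf(v));
}

// Safer approach: sort the row positions themselves, comparing by key.
// Ties fall back to the original position, so the order is deterministic
// and every row appears exactly once.
function argsort(values) {
  return values
    .map((_, i) => i)
    .sort((i, j) => values[i] - values[j] || i - j);
}

const badOrder = argsortByLookup(keys);
const goodOrder = argsort(keys);

console.log(badOrder);  // [1, 1, 2, 2, 0, 0, 0]
console.log(goodOrder); // [1, 5, 2, 6, 0, 3, 4]

// Rebuild the rows in sorted order: [index, X1, X2, X1_encoded].
const sortedRows = goodOrder.map((i) => [i, x1[i], x2[i], keys[i]]);
console.log(sortedRows);
```

Until a fixed release is available, sorting row positions like this outside the dataframe and rebuilding the columns from the result may serve as a workaround.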