shakedzy / dython

A set of data tools in Python
http://shakedzy.xyz/dython/
MIT License

Unable to find associations for dataset with 40k variables #109

Closed: lyrasheu1210 closed this issue 2 years ago

lyrasheu1210 commented 2 years ago

Version check:

Run and copy the output:

import sys, dython
print(sys.version_info)
print(dython.__version__)

sys.version_info(major=3, minor=8, micro=8, releaselevel='final', serial=0)
0.6.8

Describe the bug:

Code to reproduce:

import dython

import pandas as pd
from dython.nominal import associations

tbl = pd.read_csv("dhs_table.csv")
hi = associations(tbl, nom_nom_assoc='cramer', clustering=True, compute_only=True)
bye = hi["corr"]
bye.to_csv("dhs_correlation.csv") 

Error message:

warnings.warn(
/anaconda3/lib/python3.8/site-packages/dython/nominal.py:137: RuntimeWarning: Unable to calculate Cramer's V using bias correction. Consider using bias_correction=False

Input data:

The input data contains ~40,000 columns (variables) x 33 rows. Do I have to paste the whole data table? Here is a small excerpt:

             0  1  2  3
CENH3        G  B  E  E
H1           B  G  E  F
H2A          B  E  G  F
H2AK121ub    A  E  A  G
H2Aub        A  E  C  G
H2B          B  E  G  D
H2Bub        F  E  F  F
H3ac         G  E  G  E
H3           B  E  G  G

shakedzy commented 2 years ago

That's not an error. It's a warning.
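
To illustrate the difference, here is a minimal sketch using only Python's standard `warnings` module. The function `noisy_compute` is a hypothetical stand-in, not part of dython: it emits a `RuntimeWarning` and still returns a value, so the code after the call keeps running.

```python
import warnings


def noisy_compute(value):
    # Hypothetical stand-in for a library call that warns but still returns.
    warnings.warn("Unable to calculate using bias correction.", RuntimeWarning)
    return value


# Record warnings instead of printing them, to show the call still completes.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = noisy_compute(1.5)

print(result)                        # the function returned normally
print(caught[0].category.__name__)   # name of the recorded warning class

# Once the cause is understood, the category can be silenced locally.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    silent_result = noisy_compute(2.5)
```

An exception would have aborted the call instead. A warning only reports that something along the way may need attention.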

RuntimeWarning: Unable to calculate Cramer's V using bias correction. Consider using bias_correction=False
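
For context on when this can happen, below is a rough pure-Python sketch of Cramér's V with the commonly cited bias correction. It is an assumption that dython 0.6.8 follows the same formula. The function name and signature here are illustrative and are not dython's API. Under this formula the corrected denominator becomes zero when a column is constant or when every value in a column is distinct, and the latter is easy to hit with only 33 rows.

```python
import math
from collections import Counter


def cramers_v(x, y, bias_correction=True):
    """Cramér's V for two equal-length sequences of category labels (sketch)."""
    n = len(x)
    if n < 2:
        return float("nan")
    pair_counts = Counter(zip(x, y))
    x_counts = Counter(x)
    y_counts = Counter(y)
    r, k = len(x_counts), len(y_counts)

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for xv, xc in x_counts.items():
        for yv, yc in y_counts.items():
            expected = xc * yc / n
            observed = pair_counts.get((xv, yv), 0)
            chi2 += (observed - expected) ** 2 / expected
    phi2 = chi2 / n

    if bias_correction:
        # Shrink phi^2 and the table dimensions by their small-sample bias.
        phi2 = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
        r = r - (r - 1) ** 2 / (n - 1)
        k = k - (k - 1) ** 2 / (n - 1)

    denom = min(k - 1, r - 1)
    if denom <= 0:
        return float("nan")  # undefined here; a library might warn instead
    return math.sqrt(phi2 / denom)


# Made-up data: every value in `ids` is distinct, so the correction breaks down.
ids = ["a", "b", "c", "d", "e"]
groups = ["G", "G", "B", "B", "E"]
corrected = cramers_v(ids, groups, bias_correction=True)
uncorrected = cramers_v(ids, groups, bias_correction=False)
```

In this toy case `corrected` is NaN because the corrected row count collapses to 1, while `uncorrected` is still defined, which is consistent with the warning's suggestion to try `bias_correction=False`.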