Closed Sanchita333 closed 1 year ago
Hi, sorry for the late answer. Please see this answer: https://github.com/rotot0/tab-ddpm/issues/3#issuecomment-1287205628
In short, there is no way to reconstruct the column names. All I can say is that the columns most likely follow the order of the columns in the original .csv file. So, if the .csv file contains [num1, num2, cat1, num3, cat2], then X_num=[num1, num2, num3] and X_cat=[cat1, cat2]. The easiest way is to partition the data yourself.
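A minimal sketch of doing the partitioning yourself while recording the column names, assuming the data is loaded with pandas and you already know which columns are categorical. The helper name `split_columns` and the toy column names are hypothetical and are not part of tab-ddpm.

```python
import pandas as pd


def split_columns(df, cat_cols):
    """Split df into numerical and categorical arrays, keeping the CSV column order.

    Hypothetical helper for illustration; not part of tab-ddpm.
    """
    cat_set = set(cat_cols)
    # Preserve the order in which the columns appear in the original file.
    num_names = [c for c in df.columns if c not in cat_set]
    cat_names = [c for c in df.columns if c in cat_set]
    X_num = df[num_names].to_numpy(dtype=float)
    X_cat = df[cat_names].to_numpy(dtype=str)
    return X_num, X_cat, num_names, cat_names


# Toy frame mirroring the [num1, num2, cat1, num3, cat2] example above.
df = pd.DataFrame({
    "num1": [1.0, 2.0],
    "num2": [3.0, 4.0],
    "cat1": ["a", "b"],
    "num3": [5.0, 6.0],
    "cat2": ["x", "y"],
})
X_num, X_cat, num_names, cat_names = split_columns(df, ["cat1", "cat2"])
```

Keeping `num_names` and `cat_names` next to the arrays means the names can be reattached later without guessing the order.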
@rotot0 I have already partitioned the whole CSV data into X_num and X_cat myself, but after generation the columns of both the generated categorical dataframe and the generated numerical dataframe are shuffled. Please have a look at my screenshot above. It shows only the original categorical dataframe and the generated categorical dataframe. In the generated categorical data, all the columns have been shuffled. Please fix this issue; otherwise the library is of no use.
@shamikdhar Sorry, but I cannot reproduce your problem in my experiments. The original and generated columns are aligned. It may be a bug on your side. Otherwise, please provide additional code or information about your problem. Also, you might want to open another issue.
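One rough way to check whether categorical columns are aligned is to compare which category values occur in each original column and each generated column. The sketch below is only a diagnostic under the assumption that different columns use mostly different category values; the function `category_overlap` and the toy arrays are hypothetical and are not part of tab-ddpm.

```python
import numpy as np


def category_overlap(X_cat_real, X_cat_gen):
    """M[i, j] = fraction of values in generated column j that occur in real column i."""
    n_real = X_cat_real.shape[1]
    n_gen = X_cat_gen.shape[1]
    M = np.zeros((n_real, n_gen))
    for i in range(n_real):
        real_vals = set(X_cat_real[:, i])
        for j in range(n_gen):
            M[i, j] = np.mean([v in real_vals for v in X_cat_gen[:, j]])
    return M


# Toy data: column 0 uses {a, b}, column 1 uses {x, y}.
real = np.array([["a", "x"], ["b", "y"], ["a", "y"]])
gen_aligned = np.array([["b", "x"], ["a", "y"]])
gen_swapped = gen_aligned[:, ::-1]

# For each generated column, the index of the best-matching real column.
match_aligned = category_overlap(real, gen_aligned).argmax(axis=0)
match_swapped = category_overlap(real, gen_swapped).argmax(axis=0)
```

If the best-matching indices are not 0, 1, 2, ... in order, the columns were probably permuted somewhere in the pipeline. The check cannot distinguish columns that share the same category values.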
I am using the provided churn dataset as input, and as output I get the generated categorical and numerical columns in .npy format. There are 4 categorical and 7 numerical columns. How can I identify the names of those columns?
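A minimal sketch of attaching names to generated .npy arrays, assuming the generated columns follow the same order as the arrays that were used for training and that you kept the name lists from your own partitioning. The file names, the placeholder names `num_0`..`num_6` and `cat_0`..`cat_3`, and the helper `name_generated` are hypothetical; replace them with the real paths and the real column names of the dataset.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def name_generated(X_num, X_cat, num_names, cat_names):
    """Build one DataFrame from generated arrays and known column names."""
    if X_num.shape[1] != len(num_names):
        raise ValueError("number of numerical columns does not match num_names")
    if X_cat.shape[1] != len(cat_names):
        raise ValueError("number of categorical columns does not match cat_names")
    num_df = pd.DataFrame(X_num, columns=num_names)
    cat_df = pd.DataFrame(X_cat, columns=cat_names)
    return pd.concat([num_df, cat_df], axis=1)


# Placeholder names: 7 numerical and 4 categorical columns, as in the question.
num_names = [f"num_{i}" for i in range(7)]
cat_names = [f"cat_{i}" for i in range(4)]

# Stand-in for generated output: write two small arrays, then load them back.
with tempfile.TemporaryDirectory() as tmp:
    num_path = os.path.join(tmp, "X_num_generated.npy")  # hypothetical file name
    cat_path = os.path.join(tmp, "X_cat_generated.npy")  # hypothetical file name
    np.save(num_path, np.zeros((3, 7)))
    np.save(cat_path, np.full((3, 4), "a"))
    X_num = np.load(num_path, allow_pickle=True)
    X_cat = np.load(cat_path, allow_pickle=True)

named = name_generated(X_num, X_cat, num_names, cat_names)
```

The shape checks fail early if the number of generated columns does not match the recorded names, which is easier to debug than a silently mislabeled dataframe.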