Open J535D165 opened 6 years ago
How do you see this change: 1) as an option on the classifier's init, or 2) as a miscellaneous function that converts record pairs to numpy arrays or Python sets? Please provide more details.
I am currently working on a function that receives the match_index (a pandas.MultiIndex) and returns a list of tuples, each grouping all the record_ids that matched one another. My next step would be to assign a unique id to each group. The idea is that, when de-duplicating a dataframe, every set of matching records automatically gets one shared id. Would this be useful for PRLT? Where would you see this integrated into the API?
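To make the proposal concrete, here is a minimal sketch of what such a function could look like. It treats every matched pair as an edge and groups record ids into connected components with a small union-find. The names `group_matches` and `assign_group_ids` and the `group_id` column are hypothetical and are not part of the existing API; the sketch also assumes the de-duplication case, where both levels of the index refer to the same dataframe.

```python
import pandas as pd


def group_matches(match_index):
    """Group matched record ids into clusters (hypothetical helper).

    Each pair in match_index is treated as an edge; the result is one
    sorted tuple of record ids per connected component.
    """
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up to the root
        # while shortening the path (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in match_index:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for record_id in list(parent):
        groups.setdefault(find(record_id), []).append(record_id)
    return [tuple(sorted(members)) for members in groups.values()]


def assign_group_ids(df, match_index, column="group_id"):
    """Return a copy of df with one shared id per group of matches.

    Records that never appear in match_index get an id of their own.
    """
    groups = group_matches(match_index)
    record_to_group = {
        rid: gid for gid, members in enumerate(groups) for rid in members
    }
    next_id = len(groups)
    ids = []
    for rid in df.index:
        if rid not in record_to_group:
            record_to_group[rid] = next_id
            next_id += 1
        ids.append(record_to_group[rid])
    return df.assign(**{column: ids})


# Illustrative data only: records 0, 1, 2 match each other, as do 4 and 5.
matches = pd.MultiIndex.from_tuples([(0, 1), (1, 2), (4, 5)])
df = pd.DataFrame({"name": ["a", "a", "a", "b", "c", "c"]})
deduped = assign_group_ids(df, matches)
```

Note that grouping by connected components is transitive: if 0 matches 1 and 1 matches 2, all three land in one group even when the pair (0, 2) was never classified as a match. Whether that is the desired behaviour is a design decision.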
Record pairs are stored in pandas.MultiIndex objects. For several users, this object is hard to understand. It would be nice to add an option to store record pairs in other formats, such as numpy arrays or even Python sets.
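For reference, the conversions in question can already be done by hand with plain pandas and numpy calls. The sketch below only illustrates the target formats; the index values and level names are made up, and nothing here is an existing option of the toolkit.

```python
import numpy as np
import pandas as pd

# Illustrative record pairs; the level names are assumptions.
pairs = pd.MultiIndex.from_tuples(
    [(0, 1), (0, 2), (3, 4)], names=["id_a", "id_b"]
)

# 2-D numpy array with one row per pair and one column per index level.
pairs_array = np.array(pairs.tolist())

# Python set of tuples, convenient for membership tests.
pairs_set = set(pairs)

# Plain two-column DataFrame, for users more at home with regular columns.
pairs_frame = pairs.to_frame(index=False)
```

A set loses the order of the pairs, and an array loses the level names, so a built-in option would need to state which of these properties it preserves.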