Open NicolasRouquette opened 6 months ago
The subsequent kernels also assume data is numeric, so I suggest making it explicit in the documentation.
I added the following to the README:
Disclaimer: The data is assumed to consist of real numbers. The algorithm only accepts data in the form of a 2D array of shape (n_features, n_samples). Other shapes will be rejected, and other types of data will be treated as real numbers.
This disclaimer is supported by the following logic in the definition of GraphDiscovery objects.
I also added a more precise description to the docstrings of the GraphDiscovery object.
The normalization logic assumes the data is numeric:
https://github.com/TheoBourdais/ComputationalHypergraphDiscovery/blob/5cfe4349119ea8ee58dcce336f75959e3996b294/src/ComputationalHypergraphDiscovery/_GraphDiscoveryMain.py#L73
I suggest documenting this explicitly in the README. For example, if CSV data contains boolean or enum variables, should such columns be dropped or converted to floating-point numbers?
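To make the question concrete, here is a minimal sketch of one possible preprocessing step, written with only the standard `csv` module and NumPy. It maps boolean columns to 0.0/1.0, one-hot encodes enum columns, and returns an array of shape (n_features, n_samples). The helper name `csv_to_numeric_matrix`, its parameters, and the sample data are hypothetical and not part of ComputationalHypergraphDiscovery. Whether one-hot encoding is actually a sensible input for the library's kernels is an assumption here, not something the project confirms.

```python
import csv
import io

import numpy as np


def csv_to_numeric_matrix(text, bool_columns=(), enum_columns=()):
    """Hypothetical helper: parse CSV text into (names, X).

    X is a float array of shape (n_features, n_samples).
    """
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    names, features = [], []
    for column in reader.fieldnames:
        values = [row[column] for row in rows]
        if column in bool_columns:
            # Booleans become 0.0 / 1.0 (anything other than "true" maps to 0.0).
            features.append(
                [1.0 if v.strip().lower() == "true" else 0.0 for v in values]
            )
            names.append(column)
        elif column in enum_columns:
            # One-hot encoding: one 0/1 feature per category, in sorted order.
            for category in sorted(set(values)):
                features.append([1.0 if v == category else 0.0 for v in values])
                names.append(f"{column}={category}")
        else:
            # Numeric columns are parsed as floats; float() raises on bad input.
            features.append([float(v) for v in values])
            names.append(column)
    return names, np.asarray(features, dtype=float)


sample = """temperature,is_open,color
20.5,true,red
18.0,false,blue
22.1,true,red
"""

names, X = csv_to_numeric_matrix(
    sample, bool_columns={"is_open"}, enum_columns={"color"}
)
print(names)    # ['temperature', 'is_open', 'color=blue', 'color=red']
print(X.shape)  # (4, 3)
```

Note that the one-hot columns for a single enum always sum to 1, so they are linearly dependent. Dropping such columns entirely is the other option raised above, and the README could state which of the two is recommended.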