DOC dataset fetch_* function return types unclear

romanlutz commented 1 year ago

Describe the issue linked to the documentation

Example:

I'm confused. What happens if return_X_y=False? It's not mentioned. I happen to know the implementation, but other readers may not, so this needs clarifying.

This applies to basically every function in the fairlearn.datasets module AFAIK.

Suggest a potential alternative/fix

After reading this I'm thinking there are a few cases:

return_X_y=True and as_frame=False gets me (data, target) with ndarrays
return_X_y=True and as_frame=True gets me (data, target) with pandas.DataFrame and Series
return_X_y=False is not mentioned. Should it be mentioned in the first case (dataset)?
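
To make the three cases concrete, here is a minimal sketch. It uses scikit-learn's bundled load_iris as a stand-in, because it takes the same return_X_y and as_frame flags and needs no download. That the fairlearn.datasets fetch_* functions behave the same way is an assumption here, based on the convention they appear to follow, not something this snippet verifies.

```python
# Stand-in for a fairlearn.datasets fetch_* call (assumption: same flag semantics).
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.utils import Bunch

# Case 1: return_X_y=True, as_frame=False -> (data, target) as NumPy arrays
X_arr, y_arr = load_iris(return_X_y=True, as_frame=False)

# Case 2: return_X_y=True, as_frame=True -> (data, target) as DataFrame and Series
X_df, y_ser = load_iris(return_X_y=True, as_frame=True)

# Case 3: return_X_y=False (the default) -> a single Bunch object
# whose .data and .target attributes hold what cases 1 and 2 return as a tuple
bunch = load_iris(return_X_y=False, as_frame=False)

print(type(X_arr).__name__, type(y_arr).__name__)
print(type(X_df).__name__, type(y_ser).__name__)
print(type(bunch).__name__, sorted(bunch.keys()))
```

The third call is the one the current docstring leaves implicit: with return_X_y=False nothing is unpacked, and data and target have to be read off the returned Bunch.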

nit: I would also like there to be a colon instead of full stop in the first line after "attributes."

MiroDudik commented 1 year ago

I think that much of it originates with https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_openml.html

I think that it is confusing to have three cases. The two return_X_y=True cases should be merged and the logic of the return types just described (the same as we do with data and target fields under Bunch).

Also, I don't think that we necessarily need to mention return_X_y=False under Bunch--since this is the default behavior--but I don't mind either way.

If anything, I think it would be great if we could indent the listing of the attributes of Bunch.

hildeweerts commented 1 year ago

Closed by #1216

fairlearn / fairlearn

DOC dataset fetch_* function return types unclear #1205
