microsoft / python-in-excel

Python in Microsoft Excel
MIT License

DataFrame index treated inconsistently by array_preview #39

Open fzumstein opened 9 months ago

fzumstein commented 9 months ago

Sometimes it prints the index, sometimes it doesn't, see screenshot. Can you confirm whether this is a "feature" or a bug?

[Image: Screenshot 2023-10-02 at 3 23 41 PM]

keyur32 commented 9 months ago

Sorry for the delay. Yes, this is intentional.

When the output type is set to Excel values, a DataFrame will only output the index if the values of the index column are not numeric (as is the case for describe() or groupby()) OR if the index name has been set.

Will add this to our dataframes documentation.

Was there something else you were expecting?
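
For reference, the rule described in this comment can be written as a small pandas check. This is only a sketch of the rule as worded here, not Excel's actual implementation; the function name `shows_index_by_stated_rule` is made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def shows_index_by_stated_rule(df: pd.DataFrame) -> bool:
    """Hypothetical helper: apply the rule as stated in the comment.

    The index is output if its values are not numeric OR if the
    index has a name. This mirrors the wording only; it is an
    assumption that Excel evaluates the rule this way.
    """
    index_is_numeric = is_numeric_dtype(df.index)
    index_has_name = df.index.name is not None
    return (not index_is_numeric) or index_has_name


# Default numeric index, no name: the rule says "no index".
plain = pd.DataFrame({"a": [1, 2, 3]})

# describe() produces string labels (count, mean, ...) as the index.
described = plain.describe()

# groupby() on a string column yields a named, non-numeric index.
grouped = pd.DataFrame({"k": ["x", "y", "x"], "v": [1, 2, 3]}).groupby("k").sum()

# Numeric index with a name set.
named = plain.rename_axis("id")

for label, frame in [("plain", plain), ("described", described),
                     ("grouped", grouped), ("named", named)]:
    print(label, shows_index_by_stated_rule(frame))
```

Under this reading, only `plain` is output without its index.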

fzumstein commented 9 months ago

Hm, according to your rules, df2 should not show the index (numeric index without an index name), but it does.

keyur32 commented 9 months ago

Thanks Felix! Good point. I'll follow up with the team to understand what is going on there.

fzumstein commented 9 months ago

Thanks! I'd still love to be able to decide myself whether the index is shown or not. If I give my index a name in sheet1.A1, which will have a side effect on what I do in sheet10.Z50, that could be very confusing. I'd argue that the conversion from Python object to values would be much cleaner via Data Type properties (such as A1.arrayPreview that exists today), so you could choose A1.valuesWithIndex or A1.values (which would default to no index).

KimJun9011 commented 9 months ago

Adding a comment here: this is expected. If the index is explicitly set by the user (in this case, index = [0,1,2]), we show the index in the UI. By default, pandas creates an auto-generated RangeIndex when the index is not explicitly set. We suppress those auto-generated indexes, which explains why you don't see an index for df1.
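
The distinction drawn here is visible in pandas itself. The sketch below shows one way to tell an auto-generated index from an explicitly passed one; the helper name `has_default_range_index` is hypothetical, and it is an assumption that Excel's check resembles this.

```python
import pandas as pd


def has_default_range_index(df: pd.DataFrame) -> bool:
    """Hypothetical helper: True if the index looks auto-generated.

    Assumption: "auto-generated" means an unnamed RangeIndex that
    starts at 0 with step 1, which is what pandas creates when no
    index is passed.
    """
    idx = df.index
    return (
        isinstance(idx, pd.RangeIndex)
        and idx.start == 0
        and idx.step == 1
        and idx.name is None
    )


# No index passed: pandas creates a RangeIndex.
df1 = pd.DataFrame({"a": [1, 2, 3]})

# Same labels passed explicitly as a list: pandas stores an integer
# Index rather than a RangeIndex.
df2 = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])

print(type(df1.index).__name__, has_default_range_index(df1))
print(type(df2.index).__name__, has_default_range_index(df2))

# The labels themselves compare equal, so the two frames differ only
# in how the index was created.
print(df1.index.equals(df2.index))
```

Note that a check like this cannot distinguish a default index from a user who explicitly passes `pd.RangeIndex(3)`, so it only approximates "set by the user".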