Closed wayner9 closed 11 months ago
You will need to do a few steps to convert the percentage string to a percentage number in Excel:
The following example from the XlsxWriter docs demonstrates this for a dataframe:
https://xlsxwriter.readthedocs.io/example_pandas_percentage.html
import pandas as pd
# Create a Pandas dataframe from some data.
df = pd.DataFrame({"Names": ["Anna", "Arek", "Arun"], "Grade": ["100%", "70%", "85%"]})
# Convert the percentage strings to percentage numbers.
df["Grade"] = df["Grade"].str.replace("%", "")
df["Grade"] = df["Grade"].astype(float)
df["Grade"] = df["Grade"].div(100)
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter("pandas_percent.xlsx", engine="xlsxwriter")
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name="Sheet1")
# Get the xlsxwriter workbook and worksheet objects.
workbook = writer.book
worksheet = writer.sheets["Sheet1"]
# Add a percent number format.
percent_format = workbook.add_format({"num_format": "0%"})
# Apply the number format to Grade column.
worksheet.set_column(2, 2, None, percent_format)
# Close the Pandas Excel writer and output the Excel file.
writer.close()
Thanks, so it looks like I have to set a number format for an entire column, using set_column. So to set the format for columns B and C I would use set_column(1,2,None,percent_format). It looks like I could also do this for a row with set_row(32,None,percent_format).
It looks like the set_row command only operates on a single row, whereas set_column operates on a range of columns. Is that correct?
Is there a set_range command where I could set a range of cells to a specific format, as in set_range(A1,D5,percent_format)?
Or is the only option to set formatting for an entire row or column?
And if I were to do multiple format calls on both columns and rows would the latest set_row (or set_column) be the formatting used if a used a set_column call to set one format and then a set_row to set another format?
Below is the dataframe that I am bringing into excel. I want columns 4 and 5 to be percentage with 0 decimals (0% in excel formatting custom code nomenclature). I want all other columns to be percentage with one decimal (0.0%). I want row ten to be general format with two decimal places (0.00).
So the way to do this would I would do something like this:
set_column(1,15,None,percent_onedec_format)
set_column(3,4,None,percent_zerodec_format)
set_row(9,None,twodec_format)
Does that make sense?
ExpRet Vol Min Max Iter1 Iter2 Iter3 Iter4 Iter5 Iter6 Iter7 Iter8 Iter9 Iter10
FGE 5.7% 16.0% 0% 100% 0.0% 1.8% 3.0% 12.2% 30.4% 43.8% 50.0% 50.0% 50.0% 50.0%
FFI 4.8% 6.0% 0% 100% 0.0% 10.5% 20.3% 33.1% 19.5% 6.2% 0.0% 0.0% 0.0% 0.0%
FRA 4.5% 10.0% 0% 25% 0.2% 0.9% 1.3% 4.7% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
FAR 6.5% 7.0% 0% 25% 0.4% 15.2% 25.0% 25.0% 25.0% 25.0% 25.0% 25.0% 25.0% 25.0%
FCO 7.0% 8.0% 0% 25% 0.1% 10.0% 19.3% 25.0% 25.0% 25.0% 25.0% 25.0% 25.0% 25.0%
CASH 2.5% 0.5% 0% 100% 99.3% 61.7% 31.1% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
Return 2.5% 3.8% 4.9% 5.8% 6.0% 6.1% 6.2% 6.2% 6.2% 6.2%
Vol 0.5% 2.0% 3.6% 5.1% 6.7% 8.2% 9.0% 9.0% 9.0% 9.0%
Sharpe 0.05 0.67 0.68 0.65 0.53 0.45 0.41 0.41 0.41 0.41
Ploss 0.0% 3.0% 8.5% 12.8% 18.4% 22.8% 24.6% 24.6% 24.6% 24.6%
Var05 1.7% 0.5% -1.0% -2.6% -5.0% -7.4% -8.7% -8.7% -8.7% -8.7%
Var01 1.3% -0.9% -3.4% -6.1% -9.6% -13.0% -14.8% -14.8% -14.8% -14.8%
It looks like the set_row command only operates on a single row, whereas set_column operates on a range of columns. Is that correct?
Yes.
Is there a set_range command where I could set a range of cells to a specific format, as in set_range(A1,D5,percent_format)?
No, unfortunately note. There is conditional formatting that can be applied to a range but I wouldn't use it for this.
Or is the only option to set formatting for an entire row or column?
Yes. Unless, you want to want to iterate over each column of the dataframe and write them out individually using XlsxWriter directly and skip the to_excel()
step.
And if I were to do multiple format calls on both columns and rows would the latest set_row (or set_column) be the formatting used if a used a set_column call to set one format and then a set_row to set another format?
Row and Column formats don't merge automatically so in that case you would need to do individual cell formatting.
So the way to do this would I would do something like this:
Yes, that looks right.
Question
I asked this question on Stack Overflow, but I am not getting any bites.
See here: https://stackoverflow.com/questions/77603992/storing-a-dataframe-to-excel-as-numbers-with-percent-or-other-formatting
When I save a df to excel that contains elements like 20.3% it is being written as text, even when I use the
engine_kwargs={'options': {'strings_to_numbers': True}}
argument. Is there any way to have these be imported into excel as numbers with a 0.0% formatting style? See the Stack Overflow link for more details.I can also send the data to excel write as floats, as in 0.203, but then I have to do formatting in Excel, when I have already done the proper formatting in Python/Pandas.