thomas-saigre / tikzplotly

Export plotly figures as TikZ/PGFplots for integration in LaTeX
https://pypi.org/project/tikzplotly/
MIT License

column names should only permit whitelisted characters #7

Closed JasonGross closed 8 months ago

JasonGross commented 8 months ago

Any character in a column name that is not on a small whitelist of ASCII characters should be replaced with an escape sequence such as _x<HEX_ENCODING>_. Here is a non-working example:

Python code:

```python
import plotly.express as px
import plotly.graph_objects as go
import tikzplotly
from numpy import array, float32

args = {
    "data": [
        {
            "hovertemplate": "=(Wpos[0] - 𝔼dim=0Wpos)WVWOWU<br>output logit token=%{x}<br>input token=%{y}",
            "legendgroup": "(Wpos[0] - 𝔼dim=0Wpos)WVWOWU",
            "marker": {"color": "#636efa", "symbol": "circle"},
            "mode": "markers",
            "name": "(Wpos[0] - 𝔼dim=0Wpos)WVWOWU",
            "orientation": "v",
            "showlegend": True,
            "x": array([0, 1, 2, 3, 4]),
            "xaxis": "x",
            "y": array(
                [-0.02036323, 0.28388417, 0.18927571, 0.12493635, 0.11766034],
                dtype=float32,
            ),
            "yaxis": "y",
            "type": "scatter",
        },
        {
            "hovertemplate": "=(Wpos[1] - 𝔼dim=0Wpos)WVWOWU<br>output logit token=%{x}<br>input token=%{y}",
            "legendgroup": "(Wpos[1] - 𝔼dim=0Wpos)WVWOWU",
            "marker": {"color": "#EF553B", "symbol": "circle"},
            "mode": "markers",
            "name": "(Wpos[1] - 𝔼dim=0Wpos)WVWOWU",
            "orientation": "v",
            "showlegend": True,
            "x": array([0, 1, 2, 3, 4]),
            "xaxis": "x",
            "y": array(
                [-0.33489266, -0.559567, -0.29826388, -0.6002149, -0.20913883],
                dtype=float32,
            ),
            "yaxis": "y",
            "type": "scatter",
        },
        {
            "hovertemplate": "=(Wpos[2] - 𝔼dim=0Wpos)WVWOWU<br>output logit token=%{x}<br>input token=%{y}",
            "legendgroup": "(Wpos[2] - 𝔼dim=0Wpos)WVWOWU",
            "marker": {"color": "#00cc96", "symbol": "circle"},
            "mode": "markers",
            "name": "(Wpos[2] - 𝔼dim=0Wpos)WVWOWU",
            "orientation": "v",
            "showlegend": True,
            "x": array([0, 1, 2, 3, 4]),
            "xaxis": "x",
            "y": array(
                [0.22808522, 0.22537012, -0.06354575, 0.40831771, 0.12105861],
                dtype=float32,
            ),
            "yaxis": "y",
            "type": "scatter",
        },
        {
            "hovertemplate": "=(Wpos[3] - 𝔼dim=0Wpos)WVWOWU<br>output logit token=%{x}<br>input token=%{y}",
            "legendgroup": "(Wpos[3] - 𝔼dim=0Wpos)WVWOWU",
            "marker": {"color": "#ab63fa", "symbol": "circle"},
            "mode": "markers",
            "name": "(Wpos[3] - 𝔼dim=0Wpos)WVWOWU",
            "orientation": "v",
            "showlegend": True,
            "x": array([0, 1, 2, 3, 4]),
            "xaxis": "x",
            "y": array(
                [0.12717074, 0.05031281, 0.17253397, 0.06696074, -0.02957997],
                dtype=float32,
            ),
            "yaxis": "y",
            "type": "scatter",
        },
    ],
}

fig = go.Figure()
for t in args["data"]:
    fig.add_trace(go.Scatter(t))
fig.show()

code = tikzplotly.get_tikz_code(fig)
with open("test.tex", "w") as f:
    header = r"""\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}
\begin{document}
"""
    footer = r"""\end{document}"""
    f.write(f"{header}\n{code}\n{footer}")
```
LaTeX output:

```latex
\documentclass{article}
\usepackage{pgfplots}
\pgfplotsset{compat=newest}
\begin{document}

% This file was created with tikzplotly version 0.1.1.
\pgfplotstableread{data0 (Wpos[0]_-_𝔼dim=0Wpos)WVWOWU (Wpos[1]_-_𝔼dim=0Wpos)WVWOWU (Wpos[2]_-_𝔼dim=0Wpos)WVWOWU (Wpos[3]_-_𝔼dim=0Wpos)WVWOWU
0 -0.02036323 -0.33489266 0.22808522 0.12717074
1 0.28388417 -0.559567 0.22537012 0.05031281
2 0.18927571 -0.29826388 -0.06354575 0.17253397
3 0.12493635 -0.6002149 0.40831771 0.06696074
4 0.11766034 -0.20913883 0.12105861 -0.02957997
}\dataZ

\begin{tikzpicture}
\begin{axis}
\addplot+ [only marks, mark=*, mark options={solid, fill=636efa, color=636efa}] table[y=(Wpos[0]_-_𝔼dim=0Wpos)WVWOWU] {\dataZ};
\addlegendentry{(Wpos[0] - 𝔼dim=0Wpos)WVWOWU}
\addplot+ [only marks, mark=*, mark options={solid, fill=EF553B, color=EF553B}] table[y=(Wpos[1]_-_𝔼dim=0Wpos)WVWOWU] {\dataZ};
\addlegendentry{(Wpos[1] - 𝔼dim=0Wpos)WVWOWU}
\addplot+ [only marks, mark=*, mark options={solid, fill=00cc96, color=00cc96}] table[y=(Wpos[2]_-_𝔼dim=0Wpos)WVWOWU] {\dataZ};
\addlegendentry{(Wpos[2] - 𝔼dim=0Wpos)WVWOWU}
\addplot+ [only marks, mark=*, mark options={solid, fill=ab63fa, color=ab63fa}] table[y=(Wpos[3]_-_𝔼dim=0Wpos)WVWOWU] {\dataZ};
\addlegendentry{(Wpos[3] - 𝔼dim=0Wpos)WVWOWU}
\end{axis}
\end{tikzpicture}

\end{document}
```

Error message from pdflatex:

Package pgfplots notification 'compat/show suggested version=true': document ha
s been generated with the most recent feature set (\pgfplotsset{compat=1.16}).

! Undefined control sequence.
\GenericError  ...                                
                                                    #4  \errhelp \@err@     ...
l.13 }\dataZ
            ^^M
?

JasonGross commented 8 months ago

I can add

def sanitize(ch):
        if ch in "[]{}= ": return f"_{ord(ch):x}"
        # if not ascii, return hex
        if ord(ch) > 127: return f"_{ord(ch):x}"
        # if not printable, return hex
        if not ch.isprintable(): return f"_{ord(ch):x}"
        return ch

t["name"] = "".join(sanitize(ch) for ch in t["name"])

to make the code work, so this is a reasonable starting point. (Note that [] and Unicode characters seem to result in a generic error; I expect {} to break things too but haven't tested it; and = truncates the column name when it is looked up.)
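
For comparison, a stricter whitelist-based variant along the lines of the _x<HEX_ENCODING>_ proposal could look like the sketch below. This is only an illustration, not tikzplotly's implementation: the name `sanitize_column_name` and the contents of `SAFE_CHARS` are assumptions, and the whitelist is deliberately conservative (ASCII letters, digits and `-`). The underscore is left out of the whitelist on purpose, so that every `_x..._` sequence in the result is an escape and the encoding stays unambiguous.

```python
import string

# Assumed whitelist: characters passed through unchanged.
# "_" is intentionally absent so that "_x<HEX>_" always marks an escape.
SAFE_CHARS = frozenset(string.ascii_letters + string.digits + "-")


def sanitize_column_name(name: str) -> str:
    """Replace every non-whitelisted character with _x<HEX>_ (uppercase hex code point)."""
    return "".join(
        ch if ch in SAFE_CHARS else f"_x{ord(ch):X}_"
        for ch in name
    )


print(sanitize_column_name("(Wpos[0] - 𝔼dim=0Wpos)WVWOWU"))
# _x28_Wpos_x5B_0_x5D__x20_-_x20__x1D53C_dim_x3D_0Wpos_x29_WVWOWU
```

Because each escape has explicit start and end markers, two different names can never collapse to the same column key, which the shorter `_<hex>` form above does not guarantee.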

thomas-saigre commented 8 months ago

You're right! There is still an issue with the legend, whose text also gets replaced by the sanitized name, but after some manual cleanup we get working LaTeX code:

\begin{axis}
\addplot+ [only marks, mark=*, mark options={solid, fill=blue, color=blue}] table[y=(W<sub>pos</sub>_5b0_5d_20-_20_1d53c<sub>dim_3d0</sub>W<sub>pos</sub>)W<sub>V</sub>W<sub>O</sub>W<sub>U</sub>] {\dataZ};
\addlegendentry{$(W^{pos}_0 - \mathbb{E}^{dim=0}W^{pos})W^{V}W^{O}W^{U}$}
\addplot+ [only marks, mark=*, mark options={solid, fill=red, color=red}] table[y=(W<sub>pos</sub>_5b1_5d_20-_20_1d53c<sub>dim_3d0</sub>W<sub>pos</sub>)W<sub>V</sub>W<sub>O</sub>W<sub>U</sub>] {\dataZ};
\addlegendentry{$(W^{pos}_1 - \mathbb{E}^{dim=0}W^{pos})W^{V}W^{O}W^{U}$}
\addplot+ [only marks, mark=*, mark options={solid, fill=green, color=green}] table[y=(W<sub>pos</sub>_5b2_5d_20-_20_1d53c<sub>dim_3d0</sub>W<sub>pos</sub>)W<sub>V</sub>W<sub>O</sub>W<sub>U</sub>] {\dataZ};
\addlegendentry{$(W^{pos}_2 - \mathbb{E}^{dim=0}W^{pos})W^{V}W^{O}W^{U}$}
\addplot+ [only marks, mark=*, mark options={solid, fill=red, color=red}] table[y=(W<sub>pos</sub>_5b3_5d_20-_20_1d53c<sub>dim_3d0</sub>W<sub>pos</sub>)W<sub>V</sub>W<sub>O</sub>W<sub>U</sub>] {\dataZ};
\addlegendentry{$(W^{pos}_3 - \mathbb{E}^{dim=0}W^{pos})W^{V}W^{O}W^{U}$}
\end{axis}
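
One way to avoid the legend problem is to sanitize only the column key and keep the original trace name for the legend. The sketch below illustrates that idea under stated assumptions: `column_key` and `addplot_lines` are hypothetical helpers, not part of tikzplotly's API, and the escaping rule merely follows the spirit of the sanitize() snippet above. The legend text is passed through untouched, so it may still need LaTeX-specific escaping or manual rewriting, as was done by hand here.

```python
def column_key(name: str) -> str:
    """Hex-escape every character that is not an ASCII letter or digit."""
    return "".join(
        ch if ch.isascii() and ch.isalnum() else f"_{ord(ch):x}"
        for ch in name
    )


def addplot_lines(name: str, table_macro: str = r"\dataZ") -> str:
    """Build the TikZ lines for one trace: sanitized key for the table lookup,
    original (unsanitized) name for the legend entry."""
    return "\n".join(
        [
            f"\\addplot+ [only marks] table[y={column_key(name)}] {{{table_macro}}};",
            f"\\addlegendentry{{{name}}}",
        ]
    )


print(addplot_lines("W[0] = x"))
```

The same `column_key` result would also have to be written into the header row of the `\pgfplotstableread` block so that the lookup in `table[y=...]` matches.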
