bokeh / bokeh

Interactive Data Visualization in the browser, from Python
https://bokeh.org
BSD 3-Clause "New" or "Revised" License

Latex axis and title labels #6031

Closed denglert closed 2 years ago

denglert commented 7 years ago

It would be nice to improve the aesthetic quality of the figures generated by bokeh, by including the option to add Latex axis labels. The cosmetic aspect is one issue, but in case of complex variables this could affect readability as well if one is forced to use simple ASCII.

There already seems to be some partial support for latex labels introduced in 0.12.2:

Although to me it seems that with this feature you can only place labels on top of the canvas, but not modify the axis labels or title. Another concerned user:

In summary adding latex support would highly improve the readability, aesthetic quality and the credibility of figures generated by bokeh.

michaelaye commented 7 years ago

Another physicist here. Especially for your Gitter-described use case of dash-boards referenced by publications I would call this an almost required feature. I need to use sub- or superscript and greek symbols all the time and readability is important for many graphs. Thanks for Bokeh!

mforbes commented 7 years ago

Agreed. A very important feature for scientific presentation.

jsignell commented 7 years ago

Assuming you are using python 3, you can use superscripts, subscripts and greek letters directly:

import numpy as np
from bokeh.plotting import figure, show, output_file

p = figure(x_axis_label="β = y³ - xᵢ")

y = np.random.random(250) * 10
x = np.random.random(250) * 10
p.circle(x=x, y=y)

show(p)

[Screenshot of the resulting plot]

ajjackson commented 7 years ago

Materials chemist here, this is absolutely critical. It doesn't have to be full-blown LaTeX, we could achieve some basics with just Unicode, subscript and superscript. Some syntax for nesting multiple sub/superscript and styling individual letters would go a long way. As far as I can tell the Unicode set of subscript and superscript letters is very limited. For example, I would like to refer to the Fermi energy, which is denoted EF or to formation enthalpy which is denoted ΔHf. Unicode has no subscript "F" or "f"!

michaelaye commented 7 years ago

Just FYI, according to Wikipedia https://en.wikipedia.org/wiki/Unicode_subscripts_and_superscripts there is a 'f' subscript available.

arossi1 commented 7 years ago

I agree, @ajjackson. I'm still living with (and delivering to customers) lots of beautiful Bokeh plots that contain horrid looking axis labels. :)

ajjackson commented 7 years ago

@michaelaye I only see an "f" superscript in that table? There's a blank space below it where subscript "f" would be.

jdbocarsly commented 6 years ago

I have been following this issue (and others like it) for a long time because, without it, bokeh plots have not been usable in a publishable scientific context.

Is it the case that subscripts and superscripts have not been implemented into the core of bokeh because of the difficulty in getting full-blown "Latex"-like labels working? Full latex syntax would be nice on occasion, but really at least 95% of the use cases for scientists can be handled by implementing these three simple rules:

  1. Ability to switch between roman and italics within the label. The easiest minimal way to do this is to make letters (but not numbers) within $$ italic, and all others roman (or whatever the user has as the default style). So, we can write things like: $x$ = log($y$)

    which would be typeset as: x = log(y)

    If desired, support for changing font style with \rm, \bf, \it, etc. within $$ could be added, as is present in mathtext.

  2. superscripts as $x^1$ and $x^(1-y)$. (the y should be italics as well as superscripted!)

  3. subscripts as $x_1$ and $x_(1-y)$

These features (in conjunction with the use of unicode characters for greek letters, as shown above) would be plenty to keep most of us scientists perfectly content, I think. For the "once in a blue moon" where we want other features from latex we can deal with a more complicated process.
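
Rules 2 and 3 above group with parentheses, whereas LaTeX (and MathJax) group with braces. As a minimal sketch, assuming the proposed `^(...)` and `_(...)` syntax and no nested parentheses, the translation to standard LaTeX grouping could look like this. The function name `to_latex` is hypothetical and not part of Bokeh.

```python
import re


def to_latex(label: str) -> str:
    """Rewrite ^(...) and _(...) groups as ^{...} and _{...}.

    Assumption: groups are not nested; ordinary parentheses that do not
    follow '^' or '_' (such as in log(...)) are left untouched.
    """
    return re.sub(r"([\^_])\(([^()]*)\)", r"\1{\2}", label)


print(to_latex("$x^(1-y)$"))
print(to_latex("$x_(1-y)$"))
print(to_latex("$x$ = log($y$)"))
```

Single-character scripts such as `$x^1$` already match LaTeX syntax, so only the parenthesised form needs rewriting.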

For what it is worth, I have implemented a way to display log axes "properly" using superscripts from unicode. However, this approach is very unsatisfactory in general because unicode is missing a lot of letters as superscripts and subscripts! Even just superscript numbers are not consistent in unicode, with the superscript 1, 2, and 3 being larger than the other numbers!

from bokeh.models import FuncTickFormatter

# p is a figure with a y axis of type "log". Instead of showing the y-axis ticks
# as e.g. "10^-1", we want to display the powers of 10 as superscripts

p.yaxis[0].formatter = FuncTickFormatter(code="""
  var str = Math.log10(tick).toString(); //get exponent
  var newStr = "";
  for (var i=0; i<str.length;i++)
  {
    var code = str.charCodeAt(i);
    switch(code) {
      case 45: // "-"
        newStr += "⁻";
        break;
      case 49: // "1"
        newStr +="¹";
        break;
      case 50: // "2"
        newStr +="²";
        break;
      case 51: // "3"
        newStr +="³"
        break;
      default: // all digit superscripts except 1, 2, and 3 can be generated by adding 8256
        newStr += String.fromCharCode(code+8256)

    }
  }
  return 10+newStr;
""")

[Images: log-axis tick labels before and after applying the formatter]
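
The same digit-to-superscript mapping can be sketched on the Python side, for example to pre-compute label strings. This is only an illustration of the technique, not how Bokeh formats ticks; the name `power_of_ten_label` is hypothetical.

```python
# Map ASCII digits and the minus sign to their Unicode superscript forms.
SUPERSCRIPTS = str.maketrans("0123456789-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁻")


def power_of_ten_label(exponent: int) -> str:
    """Return a label such as '10⁻¹' for the given integer exponent."""
    return "10" + str(exponent).translate(SUPERSCRIPTS)


print(power_of_ten_label(-1))
print(power_of_ten_label(12))
```

A translation table avoids the per-character switch, but it inherits the same limitation: Unicode offers superscript forms for only a small set of characters.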

bryevdv commented 6 years ago

Is it the case that subscripts and superscripts have not been implemented into the core of bokeh because of the difficulty in getting full-blown "Latex"-like labels working?

It is because there is much more work to do than there are people to do it. If someone would like to volunteer to take on this task (it's not trivial, it will take a real commitment of time and effort) I will happily personally spend time with them to get them up to speed.

enricozb commented 6 years ago

Why can't the labels just be parsed with MathJax? Not trying to trivialize the difficulty of implementing this. But MathJax seems like the way of doing this.

arossi1 commented 6 years ago

There is an example that uses KaTeX (similar to MathJax) to render LaTeX, but the label placement is problematic. Please see: https://github.com/bokeh/bokeh/issues/5824

amichaut commented 4 years ago

Hi! Coming back to bokeh after a few years, and I was hoping that Latex would be easily supported in the meantime. But I see it's not. Sometimes using greek letters is not enough. Is there any way to use dots/arrows over letters, fractions, integrals, etc.? Thanks!

nikosarcevic commented 3 years ago

Hi, all

Are there any updates on this issue? Physicist here. I am truly trying to go around this problem but it is impossible not to use math mode.

bryevdv commented 3 years ago

Adding math text support was part of the recent large grant proposal that Bokeh received, but that work is only just getting underway, and I would not expect initial results to start landing before Q3

nikosarcevic commented 3 years ago

Thanks a lot. I really really need it! Cheers from Newcastle

anoe commented 3 years ago

Computer scientist here. I am baffled as to why this feature hasn't been implemented yet? The inability to put math in titles, legends, and labels pretty much renders your otherwise amazing library useless for a very large group of potential users (like me). Strange design choice, guys. I just don't get it :-)

bryevdv commented 3 years ago

Computer scientist here. I am baffled as to why this feature hasn't been implemented yet? The inability to put math in titles, legends, and labels pretty much renders your otherwise amazing library useless for a very large group of potential users (like me). Strange design choice, guys. I just don't get it :-)

@anoe Because it's actually very difficult, and there are lots of tradeoffs and compromises with any solution. That really makes this a people problem even more than a technical one, which is always the harder problem. Not that I really think you are owed more than that but since you have irritated me into commenting:

All of that is just off the top of my head. There's certainly more to contend with. Finally while this is indeed very important to a particular subset of users, it is in fact not important at all to the majority of users, and there are (lots) of other priorities, and not enough resources. The reason we will hopefully 🤞 be able to get this done soonish, after so long, is because we literally got granted a giant pile of money dedicated to paying people to work on it.

Now I have a legitimate question for you: what are you hoping to accomplish with a comment like this towards OSS maintainers? This is not a rhetorical question, I'd like you to walk me through what the purpose of this comment was for, exactly.

anoe commented 3 years ago

@bryevdv I am sorry that you seem to be taking offense. I re-read my own comment, and I honestly don't see it being nasty or mean. It is just very honest --- including that bokeh is amazing btw :-)

I had literally just realized that bokeh doesn't have this, to me, essential (even fundamental) feature. After having been eying bokeh for years (was watching youtube vids etc. when it first appeared), I finally decided to give it a proper try. And I just assumed that after 5+ years, or so (?), bokeh would of course have something as basic as math rendering covered. After all, both matplotlib and plotly have it, so it is a pretty reasonable assumption to have, I think. So, yes, I am indeed baffled to learn that it is not. Perhaps the most surprising thing to me is that it wasn't a priority from day one. Because, from my perspective as a data scientist, that is how important and fundamental math is to anything that involves plotting anything.

The second that bokeh gets this feature, I will start using it immediately. Because matplotlib is outdated, and the python interface of plotly is convoluted, confusing, and poorly documented. Bokeh wins on all parameters... except for one.

michaelaye commented 3 years ago

@anoe you implied that it was a design choice, instead of a lack of resources issue. @bryevdv probably naive question, but the existing implementation of matplotlib doesn’t help? Too much of a code coffin to just attach to bokeh?

-- Michael Aye LASP / CU Boulder

mattpap commented 3 years ago

but the existing implementation of matplotlib doesn’t help?

It doesn't, because matplotlib has access to a proper OS native latex distribution (say texlive), whereas bokeh renders in a web browser, so we are limited to libraries like MathJax or Katex (or perhaps an in-house implementation), that reimplement a subset of such a distribution for web browsers.

bryevdv commented 3 years ago

Exactly. There’s nothing wrong with MPL code, MPL just has the luxury of controlling and rendering everything in Python. All of the uncertainty and risk that raises the activation energy for starting this task so high (relative to other priorities that get picked over it) ultimately stems from the fact the actual work done by Bokeh happens in the browser, at a different time, not in Python. As for plotly, they are a fifty person company with tens of millions of dollars in funding and revenue.

nikosarcevic commented 3 years ago

I'm following the conversation here and somehow I feel guilty for commenting and reviving the issue thread. I also assumed you guys in Bokeh are a small team and my gods, I hope some entity throws cash your way so you can make this tex thing happen (and other stuff too). Because although I am trying out other stuff (holoviews etc), I keep getting back to bokeh cuz it is truly cool. And tysm for doing all the work so far (and especially having the absolute best community support).

bryevdv commented 3 years ago

I hope some entity throws cash your way so you can make this tex thing happen

This feature is part of the CZI grant the project received at the start of the year. Work is already (just) starting!

mforbes commented 3 years ago

@bryevdv Thanks for laying out the issues and design questions: that is extremely helpful, and gives those of us not so familiar with the details of Bokeh a place to start. I will echo that the inability to include math in labels is the main reason I stopped exploring Bokeh - I can't justify learning the library and spending time customizing graphs unless I know I can use them in the end for scientific presentations. Seeing this on the short-term roadmap encourages me to get involved if I can.

Will progress be discussed here or somewhere else (on the Discourse server for example)? How can interested developers get involved?

Thanks for all of the great work, and useful information!

michaelaye commented 3 years ago

Ah, of course, browser vs OS. If you'll allow me another naive one: What about the Jupyter Markdown Latex rendering code then? Would that help? Obviously you guys have thought about all of this, I just like to understand and appreciate the complexity of this task. But feel free to ignore and go on with your work! :)

bryevdv commented 3 years ago

What about the Jupyter Markdown Latex rendering code then?

Bokeh can't assume, and has to work independently of, the notebook. Their rendering is also (AFAIK) concerned with the DOM level, i.e. notebook cells. The hard part about Bokeh is that it mostly renders to an opaque canvas, which is a black box as far as the DOM is concerned. Bokeh has to handle everything inside the canvas manually. But it's not clear yet that things like mathjax and katex can even render directly into the canvas due to browser security restrictions on canvases (AFAIK there have been some experiments with e.g. SVG ForeignObject in the past to allow them to render directly on the canvas, but we will need to look again at what the state of that is). If those tools can't render directly into the canvas, then the only option we are left with is rendering to little divs on top of the canvas and positioning them absolutely. This sucks in a few ways: it means math text content can't respect our render level system for rendering inside the canvas, and more importantly can't take advantage of the work that went into building that system.

Will progress be discussed here or somewhere else (on the Discourse server for example)? How can interested developers get involved?

The internals of Bokeh have a steep learning curve. Right now there is mostly ongoing high-bandwidth direct video calls happening to discuss things and help ramp the new folks up quickly. As work progresses there will be PRs and more granular issues on the tracker. One issue https://github.com/bokeh/bokeh/issues/10995 was already opened to align on how a general refactoring of text rendering should proceed.

@tcmetzger you took some great notes in a recent call to explicitly align on scope and goals, perhaps you can summarize here?

michaelaye commented 3 years ago

But it's not clear yet that things like mathjax and katex can even render directly into the canvas due to browser security restrictions on canvases

Ouch, that's a pity. Thanks much for the insight!

p-himik commented 3 years ago

it's not clear yet that things like mathjax and katex can even render directly into the canvas due to browser security restrictions on canvases

How can it be possible, given that Bokeh can render directly onto a canvas? Or did you mean that it's not possible to render SVG onto canvas and those two can render only with SVG?

BTW there's also this, might be useful: https://github.com/CurriculumAssociates/canvas-latex

bryevdv commented 3 years ago

@p-himik Browser security policies disallow rendering DOM elements into the canvas, and basic usage of these libraries renders into divs. There were some experiments with katex/mathjax to render to SVG instead and potentially having a way to inject that SVG into canvas as a ForeignObject but I don't know what the current status of that is.

I was not aware of that tool so it is potentially something to look into (though more research is also more work) but we would also need something that works outside the canvas so that e.g. math text can go in text outside plots as well.

mattpap commented 3 years ago

BTW there's also this, might be useful: https://github.com/CurriculumAssociates/canvas-latex

There was also a working, but rejected, PR to katex that added rudimentary support for canvas rendering. This seems to work, but we need to establish to what extent, because katex's AST is DOM-biased, and translation to a non-biased AST is non-trivial, especially given that we don't have access to full font metrics on canvas (at least not without parsing fonts on our own).

bryevdv commented 3 years ago

FWIW I am personally pretty settled on advocating for Mathjax:

tcmetzger commented 3 years ago

The tentative scope and goals for a first phase of LaTeX implementation @bryevdv mentioned earlier are:

  • Mathematical notations are correctly rendered in the following elements: Titles, axis labels, tick labels, labels, and div widgets.
  • Mathematical notations are correctly rendered in PNG exports (including scaled up, high-res exports for printing).
  • The docs reflect all relevant mathematical notation-related functionalities and include examples.
  • Unit tests include tests for all relevant functionalities based on the LaTeX syntax.

There are more math text-related goals that we hope to tackle in the future. We will try to keep linking this issue to the respective issues and pull requests related to all things math text so that everybody can add to the discussions there. We always appreciate your comments and contributions!

nikosarcevic commented 3 years ago

The tentative scope and goals for a first phase of LaTeX implementation @bryevdv mentioned earlier are:

  • Mathematical notations are correctly rendered in the following elements: Titles, axis labels, tick labels, labels, and div widgets.
  • Mathematical notations are correctly rendered in PNG exports (including scaled up, high-res exports for printing).
  • The docs reflect all relevant mathematical notation-related functionalities and include examples.
  • Unit tests include tests for all relevant functionalities based on the LaTeX syntax.

There are more math text-related goals that we hope to tackle in the future. We will try to keep linking this issue to the respective issues and pull requests related to all things math text so that everybody can add to the discussions there. We always appreciate your comments and contributions!

that sounds amazing. I literally cannot wait.

If I was super rich I'd be supporting you guys. Unfortunately I am not (just a poor physicist) but if it helps a bit - you have my eternal gratitude.

jdbocarsly commented 3 years ago

Exciting to see this issue revived and to hear that some serious (funded!) work has started :)

... matplotlib has access to a proper OS native latex distribution (say texlive), whereas bokeh renders in a web browser, so we are limited to libraries like MathJax or Katex (or perhaps an in-house implementation), that reimplement a subset of such a distribution for web browsers.

Regarding the matplotlib issue: thought it might be helpful to note that matplotlib only uses a native LaTeX distribution if you set the usetex option. By default, it uses mathtext, its own in-house implementation of "a subset" of TeX: https://matplotlib.org/stable/tutorials/text/mathtext.html In my opinion the default in-house implementation is more than sufficient for any label I've had to make, and looks better than plots made with usetex anyways.

The code to parse and render the math is here:

Of course, this is all written in python, not javascript, so its usability in Bokeh may be limited. However, still may be of note if the developers want to develop an in-house system not requiring mathjax/katex.

Thank you for all the work on this library!

tcmetzger commented 3 years ago

@denglert @michaelaye @mforbes @jsignell @ajjackson @arossi1 @jdbocarsly @enricozb @amichaut @nikosarcevic @anoe @p-himik Thank you all for your thoughts on this and for your patience! Bokeh 2.4 was released today, finally bringing LaTeX (and MathML) support to Bokeh. At this point, you can use LaTeX on axis labels, tick labels, div widgets, and paragraph widgets. Check out this example (by @ianthomas23) that demonstrates these new features: https://docs.bokeh.org/en/latest/docs/gallery/latex_blackbody_radiation.html

We hope to add LaTeX support to more elements soon. For more information about the new math text features and how to use them, see our release blog post and the Bokeh user guide!

Will-Cooper commented 2 years ago

Hi, this is an awesome update thanks for getting this working! I'm able to reproduce the working example, however, I found when using components instead of show (I'm working on a flask driven application w/ jinja2 templates); the labels are printed as literal, e.g. Wavelength $$\mu m$$ is what appears on the axis label. I have a workaround in my current simple case, in that I can just directly print μ in the python; but wondered why the components approach doesn't work?

mattpap commented 2 years ago

Axis labels currently don't support composite text and math text rendering, so it shouldn't work regardless of how bokeh visualisations are put together.

mattpap commented 2 years ago

Also, if you just need a greek letter, then using unicode for this is a perfectly valid solution, which avoids having to include MathJax's bundle (~2 MB).

bryevdv commented 2 years ago

don't support composite text and math text rendering

Just to be clear about this, this means things should currently work if the entire label is mathtext e.g. $$wavelength \mu m$$

mattpap commented 2 years ago

$$wavelength \mu m$$ and Wavelength $$\mu m$$ are not equivalent. The former will be interpreted fully in math mode, i.e. each character is typeset individually and whitespace is mostly removed (intrinsic math token spacing is used instead).

Will-Cooper commented 2 years ago

Hmm curious, I see what you are saying, and I have tried as you suggest and some further:

p.xaxis.axis_label = r'$$Wavelength [\mu m]$$'
... = '$$Wavelength [\mu m]$$'
... = r'$$\mu$$'
... = '$$\mu$$'

Also swapping the $$ for \[\] & \(\), also other letters like nu or pi. It's stopped printing the $$ like it was before but the Greek letter isn't being printed, only the literal input. I'm definitely using bokeh version 2.4.0 (I even made sure to print it mid script), and it works fine as described if I quickly open up ipython and use show instead of components?

bryevdv commented 2 years ago

@Will-Cooper using components means specifying the bundles explicitly in your template. Have you added the mathjax bundle?

allefeld commented 2 years ago

$$wavelength \mu m$$ and Wavelength $$\mu m$$ are not equivalent.

True, but $$\text{wavelength }\mu m$$ should be equivalent to the latter:

[Screenshot of the rendered label]

It would certainly be nicer if regular text and latex math could be mixed, but I'd guess this is a viable workaround?
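
The `\text{...}` workaround can be wrapped in a small helper so that labels are built consistently. This is a minimal sketch; `math_label` is a hypothetical name, not a Bokeh function, and it assumes the plain-text part contains no braces or other characters that are special in LaTeX.

```python
def math_label(text: str, math: str) -> str:
    """Build an all-math label, placing the plain words in a LaTeX text group."""
    # In an f-string, '{{' and '}}' produce literal braces around the text.
    return rf"$$\text{{{text}}}{math}$$"


label = math_label("wavelength ", r"\mu m")
print(label)
```

The resulting string can be assigned to an axis label in the same way as the strings tried above, for example `p.xaxis.axis_label = label`.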

Will-Cooper commented 2 years ago

@Will-Cooper using components means specifying the bundles explicitly in your template. Have you added the mathjax bundle?

Ah right you are, thank you, that works! I had missed that was a requirement when following the styling guide, sorry! For anyone else for whom this may be relevant, that means adding this line to the head of your template:

<script type="text/javascript" src="https://cdn.bokeh.org/bokeh/release/bokeh-mathjax-2.4.0.min.js"></script>
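
Since the bundle version has to match the installed Bokeh version, the tag can be generated rather than hard-coded. The sketch below assumes the CDN URL pattern seen in the line above also holds for other releases; `mathjax_script_tag` is a hypothetical helper, not part of Bokeh.

```python
def mathjax_script_tag(version: str) -> str:
    """Return the <script> tag for the bokeh-mathjax bundle of a given version."""
    # Assumption: the URL pattern is inferred from the 2.4.0 example above.
    url = f"https://cdn.bokeh.org/bokeh/release/bokeh-mathjax-{version}.min.js"
    return f'<script type="text/javascript" src="{url}"></script>'


print(mathjax_script_tag("2.4.0"))
```

In an application, the version string would typically come from `bokeh.__version__` so the template stays in sync with the installed package.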

$$wavelength \mu m$$ and Wavelength $$\mu m$$ are not equivalent.

True, but $$\text{wavelength }\mu m$$ should be equivalent to the latter:

[Screenshot of the rendered label]

It would certainly be nicer if regular text and latex math could be mixed, but I'd guess this is a viable workaround?

Agreed with this, that works well! One minor additional comment is that the font colour used doesn't follow the inbuilt themes (I'm using the "night sky" theme as it looks cool), i.e. it's remaining as the default black when the rest of the text on the plot is white.

tcmetzger commented 2 years ago

@Will-Cooper FYI: Math text using colors defined in a theme (and colors defined for the Python model in general) is something @IuryPiva is currently working on (#11636). Mixing math text and regular text is also something that will hopefully be available in one of the next releases!

Until then, you can use the MathJax color extension to manually set the color for a specific math text string, like r"$$\color{white} \nu \:(10^{15} s^{-1})$$" in the blackbody radiation example.
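
As a small sketch of that interim workaround, the color prefix can be added by a helper so that every math label gets the same color. `colored_math` is a hypothetical name; the color value is passed straight to MathJax's `\color` command.

```python
def colored_math(body: str, color: str) -> str:
    """Wrap a LaTeX math body in $$...$$ with a leading color command."""
    # In an f-string, '{{' and '}}' produce the literal braces that the color command needs.
    return rf"$$\color{{{color}}} {body}$$"


print(colored_math(r"\nu \:(10^{15} s^{-1})", "white"))
```

This reproduces the string from the blackbody radiation example and makes it easy to switch the color when the theme changes.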

tcmetzger commented 2 years ago

@Will-Cooper https://docs.bokeh.org/en/latest/docs/user_guide/embed.html#components in the docs already mentions the mathjax bundle, but I'll try to make things a little clearer around there!

EDIT: The docs updates are in https://github.com/bokeh/bokeh/pull/11668