dangmai / escape-latex

Escape LaTeX special characters with Javascript
MIT License
19 stars 4 forks source link

Shouldn't whitespace characters be escaped as well #15

Closed FSMaxB closed 6 years ago

FSMaxB commented 6 years ago

I'm not sure if this is something that you want for this library, but since I'm trying to use it to display a string mostly as is in LaTeX, it would be really useful if it would replace whitespace characters as well, because otherwise I would have to do it myself.

See https://github.com/josdejong/mathjs/issues/995

dangmai commented 6 years ago

I'm not opposed to the idea, but since I haven't used LaTeX in a while I'd need some help implementing that functionality, could you give me some pointers about:

FSMaxB commented 6 years ago

I'm not too deep into LaTeX myself, but I consider ~ which is a nonbreaking space in LaTeX a good replacement, although a different one might be necessary for tabs.

This would be useful not only in math mode but also in normal text if multiple whitespaces should produce multiple ones in the output.

One thing I'm not sure about is if this is something that should be enabled by default or require a configuration option (like a second parameter for example).

The actual escaping can be done using a regex that is applied after all previous replacements are complete.

I can make a pull request if you want, once the technical details have been agreed upon.

dangmai commented 6 years ago

Thanks for the quick response, here are a few quick thoughts I have before I leave the office:

I'll mull over this a bit more this weekend, but feel free to put your thoughts here if you have any.

FSMaxB commented 6 years ago

Thanks for the quick response

Thank you for responding to me. This is your library after all.

I agree with your points about different types of escaping and that they should be separate.

Another possible issue is that the library currently escapes a string recursively, which means a risk of stack overflow if we break up a very large string at every white space character. I'd like to come up with a safer solution before we put in white space escapes, maybe some form of event emitters.

For replacing whitespace this isn't an issue at least, because it only requires one replace via a regex after all the previous replacements are finished. But I will have to take a closer look at your implementation again.

FSMaxB commented 6 years ago

And if recursion is an issue, it just has to be rewritten procedurally in a loop. It might make sense to first find all the matches, split up the string into those and then replace them one by one by a simple lookup.

dangmai commented 6 years ago

Quick progress update: I've modified the algorithm to be iterative instead of recursive and it doesn't overflow the stack anymore for very large strings. I'll tackle the different type of escapes in the next couple of days.

dangmai commented 6 years ago

I've added a param that let the user choose whether to preserve formatting or not, and also one that lets them modify the escape mapping if they so choose. At this point, we should brainstorm about which characters that should be formatted. I'll do some research about it in the next couple of days, do you have suggestions in the mean time?

FSMaxB commented 6 years ago

Thanks!

For now I can only think of space -> ~ and tab -> maybe \qquad.

dangmai commented 6 years ago

Do you think we need one for \n? Looks like it can be \\ or \newline, but it seems that those won't stack (i.e. multiple new lines won't be generated by LaTeX)

FSMaxB commented 6 years ago

Strange, for me they do. But not in math mode though. But I think it is just not possible in math mode. (at least not without doing any matrix tricks or similar.

dangmai commented 6 years ago

Sorry it took me a long time to get back to this. I've published a beta version that contains the new changes we've discussed in this issue, you can try it by running:

npm install escape-latex@next

The README has the most up-to-date API for the library. Feel free to try it out and let me know what you think. I'll push the actual release version in about a week if nothing is wrong with the beta version.

FSMaxB commented 6 years ago

I just started integrating the new version and it generally works great, but there is one problem that has to be taken into account: If a word follows a tab or newline, it breaks the \qquad and \newline.

I think this can be fixed by adding an additional {} to the end, the same as with \textbackslash{} for example.

dangmai commented 6 years ago

Thank you for spotting that! I totally forgot about that case 🤦‍♂️

I've pushed version 1.0.0-beta.2 to fix that issue.

FSMaxB commented 6 years ago

Your beta2 doesn't actually have the fix in it!

FSMaxB commented 6 years ago

(either that or there is a bug in npm)

dangmai commented 6 years ago

Nope no bug on npm side, just carelessness on my part! 😊 I've pushed 1.0.0-beta.3 and verified that it has the fix in it

FSMaxB commented 6 years ago

Now I've tested it and it works great. I haven't tested the functions that allows changing the escape map though.

dangmai commented 6 years ago

Ok great, I'll close this issue and release the final 1.0.0 version soon. Feel free to open another one if you need more escape characters or anything else.

Thanks for the help 🥇

FSMaxB commented 6 years ago

Great, thanks a lot!