agurod42 / github-texify

GitHub App to automatically render TeX expressions in markdown files
https://github.com/apps/texify
MIT License
110 stars 8 forks

Regarding tex subfolder #7

Open atreyasha opened 6 years ago

atreyasha commented 6 years ago

Hello @agurodriguez.

I started using your texify tool recently and it is great! Thank you so much, it makes things a lot easier.

While using it, I noticed one small issue. When I change some LaTeX expressions in readme.tex.md, the tex subfolder does not delete old .svg images but instead keeps creating new ones. With a large readme file containing many TeX entries, this could lead to redundant data accumulating in the tex subfolder.

What are your thoughts on this issue? I have made a fork of your repository onto my account, just because I want to understand this tool better and maybe I can also contribute in some way.

Thanks again and best :)

Atreya

agurod42 commented 6 years ago

Hi @AtreyaSh,

I'm really glad you find this tool useful!

Regarding the issue: the problem I see is that, after one tex.md file is rendered, we can only safely delete an image if we are sure it's not being used by another tex.md file in the same folder. That is because the svg images are named using a hash of the rendered TeX expression (see https://github.com/leegao/readme2tex/blob/dfd7f6bfd6f03f9fe11bf0229e5d18a589a7dfcc/readme2tex/render.py#L32).

I think the most efficient solution (to avoid iterating over the folder or searching for text within the files) would be to keep a record of how many files are using a specific image.

Let me give you an example to make myself clear: let's say we have two tex.md files in the same folder, and they share one TeX expression (the one that generates the 11cf243da3d578d0bb8f647ad3308400.svg file).

If we can have a map.json file inside the tex folder with this content:

{
    "11cf243da3d578d0bb8f647ad3308400.svg": 2,
    "32fa466f36f8a42e849a7445380b062d.svg": 1
}

When a TeX expression is deleted (or changed) in one of those files, we can parse the map.json file and check the counter associated with the [name of the image.svg] key. If that value is greater than 1, we decrease its counter and keep the svg image. If the value is 1, we remove the pair from the map.json file and can safely delete the svg image, because we can be sure that no other file is using it.
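The counter update described above could be sketched roughly as follows. This is only an illustration in Python, not code from texify: the function name `release_image` and its arguments are hypothetical, and it assumes map.json sits next to the svg files inside the tex folder, as in the example.

```python
import json
from pathlib import Path


def release_image(tex_dir: Path, image_name: str) -> bool:
    """Drop one reference to image_name; delete the svg when none remain.

    Hypothetical helper: returns True if the svg file was deleted.
    """
    map_path = tex_dir / "map.json"
    counts = json.loads(map_path.read_text())

    remaining = counts.get(image_name, 0) - 1
    if remaining > 0:
        # Another tex.md file still uses this image: keep it, lower the count.
        counts[image_name] = remaining
        deleted = False
    else:
        # Last reference is gone: remove the entry and the svg itself.
        counts.pop(image_name, None)
        (tex_dir / image_name).unlink(missing_ok=True)  # Python 3.8+
        deleted = True

    map_path.write_text(json.dumps(counts, indent=4))
    return deleted
```

With the map.json shown above, calling `release_image` once for 11cf243da3d578d0bb8f647ad3308400.svg would lower its counter to 1 and keep the file; a second call would remove both the entry and the file.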

What do you think about it?

cben commented 6 years ago

A separate counters file sounds risky under git merges, it could go out of sync. But I think it's possible to do a kinda "mark and sweep" garbage collection - scan all .md for tex/*.svg mentions, delete those that aren't named anywhere?

EDIT: I missed above that you want to avoid iterating the text files.
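For reference, a minimal sketch of the "mark and sweep" idea in Python. Everything here is an assumption for illustration: the function name `sweep_unused_svgs`, the regular expression (which expects 32 hex character file names like the ones in map.json above), and the premise that the repository has a single tex folder whose images are referenced as tex/<hash>.svg from .md files.

```python
import re
from pathlib import Path
from typing import List

# Assumed naming scheme: 32 hex characters followed by ".svg".
SVG_REF = re.compile(r"tex/([0-9a-f]{32}\.svg)")


def sweep_unused_svgs(repo_dir: Path, tex_dir: Path) -> List[str]:
    """Delete svgs in tex_dir that no .md file under repo_dir mentions."""
    # Mark: collect every svg name referenced from any markdown file.
    referenced = set()
    for md_file in repo_dir.rglob("*.md"):
        text = md_file.read_text(encoding="utf-8", errors="ignore")
        referenced.update(SVG_REF.findall(text))

    # Sweep: remove svgs that were never marked.
    removed = []
    for svg in sorted(tex_dir.glob("*.svg")):
        if svg.name not in referenced:
            svg.unlink()
            removed.append(svg.name)
    return removed
```

This does iterate over the text files, so it trades the extra scan for not having a counters file that can drift after a merge.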

atreyasha commented 6 years ago

Hi @agurodriguez,

Thanks for the response and the detailed explanation. I think this is a good idea and should solve the issue. Maybe we could try a run in a minimal working environment.

agurod42 commented 6 years ago

Hello guys!

What @cben says is right: a counters file could end up with wrong info after a merge. Maybe the mark and sweep algorithm is not a bad idea. I'll try to code something, do some benchmarks and let you know.

atreyasha commented 6 years ago

Hi @agurodriguez,

One quick question. I have not worked much with GitHub apps before. Could I ask, how do you manage to create a working environment where you can test your code? Is there a way to run the code on your local host first before testing it on GitHub?

My apologies for the basic questions. Thank you.

agurod42 commented 6 years ago

@AtreyaSh what I do is create another GitHub App, set its visibility to private and point its webhook URL to a test backend running the code that is not ready for production.

I don't know if it's the best way to do it, but it works fine for me.

Please, feel free to ask whatever you want.

Have a good day

atreyasha commented 6 years ago

Hi @agurodriguez,

Btw I was recently using the texify app and the README.md file was not being produced from the README.tex.md file. I think this might be because rawgit is being shut down, so the links to the svgs no longer work. I am not 100% sure though. Just so you know :snail:

About the issue discussed here, I am not sure whether it has been fixed already. Nonetheless, I proposed a pre-commit hook to solve the issue of unnecessary svgs, which works here with readme2tex (https://github.com/leegao/readme2tex/issues/22). Not sure if it might be of use for texify, but good to have it here nonetheless.

Atreya