leo-colisson / robust-externalize

A LaTeX library to cache pictures (including tikz, python code, and more) in a robust, customizable, and pure way.

robExt-remove-old-figures.py not working / no indents #12

Closed dflvunoooooo closed 7 months ago

dflvunoooooo commented 7 months ago

My robExt-remove-old-figures.py is broken. Every line has a leading space, and the indentation is missing on all lines. Here is my file:

 #!/usr/bin/env python3
 import os
 import re
 import glob
 # Just run this script in order to remove all old figures not listed in robExt-all-figures.txt.

 # Note that this part is not extracted from the pdf file since it might be different on a previous run. You can however hardcode
 # it here, your updated script will not be overriden unless you remove it yourself.
 prefixes = [ "robExt-" ]
 folders  = [ "robustExternalize" ]

 def main():
 imagesToKeep = dict()
 list_all_figures_file = glob.glob('*robExt-all-figures.txt')
 for filename in list_all_figures_file:
 with open(filename) as f:
 for line in f:
 line = line.strip()
 if line.endswith('.tex'):
 imagesToKeep[line[:-4]] = True # The exact value is not important, we mostly use dict to get ~O(1) access

 listOfFilesToRemove = []
 # We are looking for images in the folders
 for folder in folders:
 for root, dirs, files in os.walk(folder):
 for f in files:
 for prefix in prefixes: # Not the most efficient, but anyway we typically have a single prefix
 # In case prefix contains weird caracters that collide with regexps:
 prefixEsc = re.escape(prefix)
 # result_search = re.search(rf"^({prefixEsc}[A-F0-1]{32}).*", f)
 result_search = re.search(rf"^(.*[A-F0-9]{{32}}).*", f)
 if result_search:
 if result_search.group(1) not in imagesToKeep:
 listOfFilesToRemove.append(os.path.join(root,f))
 for f in listOfFilesToRemove:
 print(f"-- {f}")
 print(f"Above are the files to remove, are you sure you want to proceed? [y/N] (based on prefixes {prefixes})")
 x = input().strip()
 if x not in ["y", "Y"]:
 print("All right, we abort.")
 exit(1)
 for f in listOfFilesToRemove:
 os.remove(f)
 print(f"Removed {f}")

 if __name__ == '__main__':
 main()

This will not work with python.
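
A minimal sketch of why Python rejects a file like the one above. The two excerpts are shortened, hypothetical stand-ins for the generated script, and the helper name `check` is made up for illustration; `compile` is the standard built-in.

```python
# Shortened, hypothetical excerpts; not the full generated script.
broken = " def main():\n imagesToKeep = dict()\n"        # leading space, no block indentation
fixed = "def main():\n    imagesToKeep = dict()\n"       # expected layout


def check(source):
    """Return None if the source compiles, else the name of the error raised."""
    try:
        compile(source, "<excerpt>", "exec")
        return None
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return type(err).__name__


broken_result = check(broken)
fixed_result = check(fixed)
print("broken excerpt:", broken_result)
print("fixed excerpt:", fixed_result)
```

The leading space alone is enough to stop the interpreter at the first statement, before the missing block indentation is even reached.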

tobiasBora commented 7 months ago

Oh really? That's really weird, I never experienced this, and the code here is fine (for now you can use this file): https://github.com/leo-colisson/robust-externalize/blob/70be48fd00cfe3d3c6759d86ea31482822a658a8/robust-externalize.sty#L168

What is your OS? Are you still compiling with xelatex?

PS: sorry for these bugs, the library is still young

dflvunoooooo commented 7 months ago

Ah there it is. I couldn't find the code. Thank you.

Yes, I am using XeLatex on Linux.

No problem, sorry if I bother you that much.

Edit: But the same broken python file gets created when I use pdfLatex.

tobiasBora commented 7 months ago

Weird, I'm also running linux without any issue… You downloaded the latest version from master, right? Could you send me the checksum of the .sty file, i.e. the output of md5sum *.sty? Also, did you create that file with a copy/paste into e.g. TexStudio? I'm wondering whether texstudio could have removed the indentation during the copy/paste. Could you check whether the leading space appears in your .sty file as well?
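
If md5sum is not available, the same check can be sketched in Python with the standard hashlib module. The helper name `md5_of_file` and the chunk size are assumptions made for this illustration; the output format only imitates that of md5sum.

```python
import glob
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Print one "<digest>  <filename>" line per .sty file in the current folder.
    for path in sorted(glob.glob("*.sty")):
        print(f"{md5_of_file(path)}  {path}")
```

Two copies of robust-externalize.sty are byte-identical only if their digests match, so a different digest points at a modified or outdated file.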

PS: don't apologize, feedback is really useful.

dflvunoooooo commented 7 months ago

The latest, since your implementation of gnuplot. As I mentioned in the other issue, the main says it is version 2.2. But I don't think that matters. Here is the output: c1e86ca87a0fa8571bfc18c146b0d6d8 robust-externalize.sty.

Texstudio is not the problem: I deleted the python file and ran xelatex and pdflatex on the command line, and the problem is the same.

tobiasBora commented 7 months ago

Well, it seems like you don't have the right version:

$ md5sum robust-externalize.sty
7c1e6cfbea25ee75acb49b643ceba3c6  robust-externalize.sty
$ cat test.tex
\documentclass[options]{article}

\usepackage{robust-externalize}

\begin{document}
Hey
\end{document}
$ pdflatex test.tex
…
$ cat robExt-remove-old-figures.py
#!/usr/bin/env python3
import os
import re
import glob
# Just run this script in order to remove all old figures not listed in robExt-all-figures.txt.

# Note that this part is not extracted from the pdf file since it might be different on a previous run. You can however hardcode
# it here, your updated script will not be overriden unless you remove it yourself.
prefixes = [ "robExt-" ]
folders  = [ "robustExternalize" ]

def main():
    imagesToKeep = dict()
    list_all_figures_file = glob.glob('*robExt-all-figures.txt')
    for filename in list_all_figures_file:
        with open(filename) as f:
            for line in f:
                line = line.strip()
                if line.endswith('.tex'):
                    imagesToKeep[line[:-4]] = True # The exact value is not important, we mostly use dict to get ~O(1) access

[…]

Make sure to download exactly the right file (my guess is not that texstudio runs the wrong command, but rather that copy/pasting files changes the exact formatting, like in the other issue where you had TABs coming from nowhere). The cleanest way is probably via git:

$ git clone https://github.com/leo-colisson/robust-externalize/

tobiasBora commented 7 months ago

More specifically, I think this is the issue explaining why you get all sorts of bugs: texstudio changes the indentation when copy/pasting: https://github.com/texstudio-org/texstudio/issues/1344

dflvunoooooo commented 7 months ago

I am on the right version now: my md5 is the same as yours, and the python file is generated correctly. Thank you again for your fast response.

dflvunoooooo commented 7 months ago

More specifically, this is I think the issue explaining why you get all sorts of bugs: texstudio changes the indentation when copy/pasting: texstudio-org/texstudio#1344

Yes, I noticed that, but that doesn't lead to errors. If I copy one or two environments, I don't mind Texstudio adapting the indentation; it is easily remedied. The problem is that the gnuplot CacheMeCode does not like tabs. If I replace all tabs with spaces, everything works. Texstudio offers a setting to replace all tabs with spaces.
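
For files that already contain tabs, the replacement can also be sketched outside the editor. The function name `tabs_to_spaces`, the tab width of 4, and the gnuplot lines in the sample are assumptions made for this illustration; `str.expandtabs` is the standard Python method.

```python
def tabs_to_spaces(text, tabsize=4):
    """Replace every tab with spaces, padding to the next multiple of tabsize."""
    # expandtabs resets its column counter at each newline,
    # so it can be applied to a whole multi-line snippet at once.
    return text.expandtabs(tabsize)


# Hypothetical snippet with tab indentation.
sample = "\tset title 'demo'\n\t\tplot sin(x)\n"
converted = tabs_to_spaces(sample)
print(repr(converted))
```

The tab width only affects how wide the resulting indentation is; any positive value removes the tab characters themselves.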

tobiasBora commented 7 months ago

Ok cool. Well, indentation usually doesn't matter in LaTeX, but when you include non-LaTeX code in LaTeX, especially with languages that care about indentation like python, indentation matters a lot (and this issue is an example ^^).

dflvunoooooo commented 7 months ago

That is true, thank you.