knadh / tg-archive

A tool for exporting Telegram group chats into static websites like mailing list archives.
MIT License
834 stars 124 forks

generate pretty xml #70

Closed milahu closed 2 years ago

milahu commented 2 years ago

... to reduce diff noise in output

`rss_file` and `atom_file` are defined in `feedgen/feed.py`

I tried to work around a bug in etree by removing empty indented lines (and trailing whitespace), but it's not working 0__o

```py
# tgarchive/build.py

# workaround for pretty=True: rss_file and atom_file generate empty indented lines
def write_patched_file(path, bytes):
    s_bak = bytes + b""
    bytes = re.sub(b" +\n", b"\n", bytes)  # remove empty indented lines
    if bytes == s_bak:
        raise Exception("failed to fix indent")
    #print("s:\n" + bytes.replace(b"\n", b"$\n").decode("utf8"))
    #print("s_bak:\n" + s_bak.replace(b"\n", b"$\n").decode("utf8"))
    #print(f"writing patched file {path}")
    with open(path, "wb") as f:
        f.write(bytes)

f.rss_file = lambda path, **kwargs: write_patched_file(path, f.rss_str(**kwargs))
f.atom_file = lambda path, **kwargs: write_patched_file(path, f.atom_str(**kwargs))

f.rss_file(os.path.join(self.config["publish_dir"], "index.xml"), pretty=True)
f.atom_file(os.path.join(self.config["publish_dir"], "index.atom"), pretty=True)
```
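For reference, here is a minimal standalone sketch of the cleanup step, kept separate from feedgen so it can be checked on its own. `clean_pretty_xml` is a hypothetical helper, not part of tg-archive or feedgen. It assumes the unwanted lines contain only spaces or tabs, possibly with `\r\n` line endings; the thread does not establish why the attempt above fails, so this may not be the fix.

```python
import re

# Hypothetical helper for illustration; not part of tg-archive or feedgen.
# Assumption: the noise consists of spaces/tabs before a line break.
_TRAILING_WS = re.compile(rb"[ \t]+(?=\r?\n)")
_BLANK_LINES = re.compile(rb"(\r?\n)(?:\r?\n)+")


def clean_pretty_xml(data: bytes) -> bytes:
    """Strip trailing spaces/tabs, then collapse runs of empty lines."""
    data = _TRAILING_WS.sub(b"", data)
    # Keep the first line break of each run, drop the empty lines after it.
    return _BLANK_LINES.sub(rb"\1", data)


# Made-up sample input, shaped like pretty-printed feed XML.
sample = (
    b"<rss>\n"
    b"  <channel>\n"
    b"    \n"
    b"    <title>demo</title>  \n"
    b"  </channel>\n"
    b"</rss>\n"
)
cleaned = clean_pretty_xml(sample)
print(cleaned.decode("utf8"))
```

Note that this also strips trailing whitespace inside multi-line text content, so it is only safe when that whitespace is not significant to the feed.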