wodny / ncdu-export

Standalone ncdu export feature
GNU General Public License v3.0

emoji file name crashes `ncdu` #5

Open Atemu opened 3 days ago

Atemu commented 3 days ago

Repro:

```
$ touch 🧡
$ ./find.sh . -maxdepth 1 | ./find2flat.py - | ./unflatten.py - | ncdu -f - -0
thread 3733331 panic: attempt to unwrap error
Unwind information for `:0x1066156` was not available, trace may be incomplete

Aborted (core dumped)
```

When you diff this against what `ncdu -o` exports for the same directory, ncdu's JSON contains the emoji as a raw Unicode character, while both find2flat and unflatten convert it to the escape sequence \ud83e\udde1 (a UTF-16 surrogate pair). This is indeed what's making ncdu crash: manually editing the ncdu-export-generated JSON to change the escape back to the raw character makes the crash go away.

Interestingly, other unicode codepoints such as ä (\u00e4) do work.
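For anyone following along, the difference between the two cases can be reproduced in plain Python without any of the ncdu-export scripts. This is only a minimal sketch of the default `json.dumps()` behaviour; the variable names are made up for illustration and nothing here is taken from find2flat.py or unflatten.py.

```python
import json

# U+1F9E1 (orange heart) lies outside the Basic Multilingual Plane,
# so an ASCII-only JSON escape needs a UTF-16 surrogate pair.
heart = "\U0001F9E1"
# U+00E4 (ä) lies inside the BMP, so a single \uXXXX escape is enough.
umlaut = "\u00e4"

escaped_heart = json.dumps(heart)    # ensure_ascii defaults to True
escaped_umlaut = json.dumps(umlaut)

print(escaped_heart)   # "\ud83e\udde1"
print(escaped_umlaut)  # "\u00e4"

# Python's own parser joins the pair back into one code point.
assert json.loads(escaped_heart) == heart
```

So the working and crashing inputs differ in exactly one respect: the emoji is written as two `\u` escapes that only make sense together, while ä is written as one.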

This might actually be a bug in ncdu, though ncdu would never trigger it on its own, of course, since it writes the raw characters into its JSON.

Atemu commented 3 days ago

I figured out that `json.dumps()` does this by default and that you need to turn it off by passing `ensure_ascii=False`. I could not figure out how to apply this to unflatten.py yet.

Edit: I did figure it out, I just messed up in the implementation and Python didn't scream at me... O.o
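A rough sketch of what the `ensure_ascii=False` change looks like in isolation. The `entry` dict is a hypothetical stand-in, not the real ncdu export structure, and this is not the actual patch to find2flat.py or unflatten.py. It only shows the keyword applied to both `json.dumps()` and `json.dump()`.

```python
import io
import json

# Hypothetical stand-in for one exported entry (not the real ncdu format).
entry = {"name": "\U0001F9E1"}

default_text = json.dumps(entry)                  # emoji becomes \ud83e\udde1
raw_text = json.dumps(entry, ensure_ascii=False)  # emoji is kept as-is

# json.dump() takes the same keyword when writing straight to a stream.
buffer = io.StringIO()
json.dump(entry, buffer, ensure_ascii=False)

# The raw form still has to be encoded when it leaves the process;
# in UTF-8 the emoji is four bytes.
raw_bytes = raw_text.encode("utf-8")
```

One thing to watch for: the keyword has to be passed at every place that serialises JSON, since a single remaining call with the default would reintroduce the escapes.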