lrq3000 / pyFileFixity

📂🛡️Suite of tools for file fixity (data protection for long term storage⌛) using redundant error correcting codes, hash auditing and duplications with majority vote, all in pure Python🐍
MIT License

question: `tqdm` progress display keeps lines on the screen #14

Open spock opened 10 months ago

spock commented 10 months ago

For all operations, there is a nice tqdm progress shown, but at least for me each update is printed as a new, separate line, instead of overwriting the same line over and over (which I believe is the default behavior of tqdm).

This results in lots of scrolling, pushing away more interesting messages (such as file date/checksum mismatches):

 99%|#########9| 48385/48771 [07:03<00:02, 179.60it/s]
 99%|#########9| 48403/48771 [07:03<00:02, 177.26it/s]
 99%|#########9| 48421/48771 [07:03<00:01, 175.20it/s]
 99%|#########9| 48439/48771 [07:03<00:01, 173.68it/s]
 99%|#########9| 48457/48771 [07:03<00:01, 173.15it/s]
 99%|#########9| 48475/48771 [07:03<00:01, 172.24it/s]
 99%|#########9| 48493/48771 [07:03<00:01, 172.36it/s]
 99%|#########9| 48511/48771 [07:04<00:01, 171.92it/s]

Is this behavior by design, or does it behave so only on my system?
pyFileFixity version 3.1.4 installed with pip, on Python 3.10.12, on WSL2.

lrq3000 commented 10 months ago

I guess this is an issue with the terminal you use for WSL; try another one. This happens only when the terminal doesn't support the carriage return character, which lets tqdm go back to the start of the line and overwrite it. I used to work extensively on tqdm, so I can debug these issues quite well. Normally tqdm is extremely robust, but it can't do anything if the terminal lacks the necessary features (they are standard, but you know how Microsoft does not always follow standards...).

Maybe try powershell as the host terminal?
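
To make the carriage-return mechanism above concrete, here is a minimal sketch that roughly emulates how a terminal displays text containing carriage returns and newlines. The function name `visible_lines` is hypothetical, and real terminals handle many more control sequences than these two.

```python
def visible_lines(raw):
    # Roughly emulate what a terminal ends up showing:
    # a carriage return moves the cursor back to column 0 of the current line,
    # a newline starts a fresh line, anything else overwrites at the cursor.
    lines = [""]
    col = 0
    for ch in raw:
        if ch == "\r":
            col = 0
        elif ch == "\n":
            lines.append("")
            col = 0
        else:
            current = lines[-1]
            lines[-1] = current[:col] + ch + current[col + 1:]
            col += 1
    return lines


# Two progress updates separated only by carriage returns collapse into one line.
print(visible_lines("\r 10%\r 20%"))      # [' 20%']
# The same updates with a newline after each one stay on separate lines.
print(visible_lines("\r 10%\n\r 20%\n"))  # [' 10%', ' 20%', '']
```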

spock commented 10 months ago

I'm using Windows Terminal, and trying pff in cmd, PowerShell, and Git Bash gave exactly the same behavior as in WSL2 above.

I did find a number of issues mentioning tqdm and windows together, but none quite the same.

Strangely, running this example code in ipython3 (under WSL2) works as expected:

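[image: https://user-images.githubusercontent.com/584494/282331741-279ded9c-70c7-4496-8fa8-af07d4750991.png]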

One more strange observation: if I specify the --log option, the log file also has multiple tqdm lines, separated by real newlines (as shown by less):

^M  1%|          | 244/48771 [00:11<11:51, 68.16it/s]
^M  1%|          | 252/48771 [00:11<11:20, 71.30it/s]

Wait, eureka? I see that in lib/tee.py on line 37 (def write), you have end = "\n".
And ptee object is given to tqdm as a "file-like that supports write method".
However, normal write does NOT auto-append "\n"; could this be the problem?
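
A minimal sketch of the suspected effect, using only the standard library. `NewlineTee` is a hypothetical stand-in for a tee-like writer whose `write` appends `end="\n"`; it is not the actual code from lib/tee.py, and `simulate_progress` only imitates the way a progress bar rewrites its line.

```python
import io


class NewlineTee:
    # Hypothetical tee-like writer (NOT the real lib/tee.py):
    # every write() call gets a newline appended by default.
    def __init__(self, stream):
        self.stream = stream

    def write(self, msg, end="\n"):
        self.stream.write(msg + end)

    def flush(self):
        self.stream.flush()


def simulate_progress(target, steps=3):
    # Imitate a progress bar: each update starts with a carriage return
    # and deliberately carries no trailing newline.
    for i in range(1, steps + 1):
        target.write("\r%3d%%" % (i * 10))
    target.flush()


plain = io.StringIO()
simulate_progress(plain)              # plain file-like object: write() adds nothing

teed = io.StringIO()
simulate_progress(NewlineTee(teed))   # tee-like wrapper: write() appends "\n"

print(repr(plain.getvalue()))  # '\r 10%\r 20%\r 30%'
print(repr(teed.getvalue()))   # '\r 10%\n\r 20%\n\r 30%\n'
```

With the wrapper, every update ends in a newline, so the carriage return at the start of the next update has nothing left to overwrite.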

If I do replace that "\n" with "", then tqdm status line looks/works exactly as it should... but of course all the other output is now mangled, because you expect an automatic newline with each ptee.write().

Actually, shouldn't tqdm write ONLY to stdout? It makes no sense to write that to the log.

Indeed, replacing tqdm(..., file=ptee, ...) with tqdm(..., file=stdout, ...) in rfigs.py DOES fix the problem, at least as long as I don't specify --log (which redirects stdout and thus breaks output).
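
A minimal sketch of that change, assuming tqdm is installed. `file` is a real tqdm parameter; the function `process_items` and its `progress_stream` argument are hypothetical names for illustration and do not come from rfigs.py.

```python
import io
import sys

from tqdm import tqdm


def process_items(items, progress_stream=None):
    # Resolve sys.stdout at call time rather than at definition time,
    # so a later redirection of sys.stdout is picked up.
    if progress_stream is None:
        progress_stream = sys.stdout
    total = 0
    # The progress bar goes to a plain stream whose write() adds no newline,
    # instead of a tee-like object.
    for item in tqdm(items, file=progress_stream):
        total += item
    return total


# Usage: the bar is written to the terminal, the result is returned as usual.
result = process_items(range(5))

# Writing to an in-memory buffer shows the carriage returns tqdm emits.
buffer = io.StringIO()
process_items(range(5), progress_stream=buffer)
```

This sketch does not address the --log case mentioned above, where stdout itself is redirected.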

I'll probably give a "native Linux" pff a try tomorrow; I don't believe it works there either with def write(..., end="\n", ...).

lrq3000 commented 10 months ago

You're right, the output of tqdm shouldn't be sent to ptee. I used to do that to also keep a trace of the progress bar in the log (which can be very useful for seeing where a crash or interruption happened), but tqdm is indeed not made for this purpose, so I guess it's safer to send it to stdout rather than ptee.

So yes, if you open a PR I will happily merge it, and I will apply the same change to the other tqdm instances in the other commands.

Thank you for your very valuable and detailed feedback!
