ArchiveBox / ArchiveBox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
https://archivebox.io
MIT License

Bug: print causes the io error 9: Bad file descriptor when exporting static html #1269

Open sasasqt opened 11 months ago

sasasqt commented 11 months ago

Describe the bug

These commands:

/home/user/.local/bin/archivebox list --html > /home/user/archivebox/index.html
/home/user/.local/bin/archivebox list --html

raise an error at this line:

print(output)

OSError: [Errno 9] Bad file descriptor

https://github.com/ArchiveBox/ArchiveBox/blob/f5739506f637734fa194b9bf7c54f01b1333b5a2/archivebox/main.py#L883C4-L883C4


I changed print(output) to

        with open('/home/user/archivebox/index.html','w') as file:
            file.write(output)

and the problem was resolved. I guess `import sys; sys.stdout = open('stdout.txt', 'w')` would also work.
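
The workaround above can be generalized so that normal printing still works and the file is only used when stdout fails. This is a minimal sketch, not ArchiveBox's code: `emit_output`, `fallback_path`, and the `stream` parameter are hypothetical names introduced here for illustration.

```python
import errno
import sys
from pathlib import Path


def emit_output(output, fallback_path, stream=None):
    """Print `output`; if the stream reports a bad descriptor, write a file.

    Returns "stream" or "file" to say where the text ended up.
    `emit_output` and `fallback_path` are hypothetical, not ArchiveBox names.
    """
    stream = sys.stdout if stream is None else stream
    try:
        print(output, file=stream, flush=True)
        return "stream"
    except OSError as err:
        # Only handle the error from this report; re-raise anything else.
        if err.errno != errno.EBADF:
            raise
        Path(fallback_path).write_text(output + "\n", encoding="utf-8")
        return "file"
```

Called as `emit_output(output, '/home/user/archivebox/index.html')`, it would behave like `print(output)` when stdout is healthy and fall back to the file otherwise. If the stream fails partway through, some text may already have been written to it before the fallback runs.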


I suspect the stdout pipe filled up and caused the error, but I cannot reliably reproduce it.
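
Since the error is hard to reproduce, one way to narrow it down is to record whether the stdout descriptor is still valid right before the failing print. The sketch below uses only the standard library; `fd_is_open` and `describe_stdout` are hypothetical helper names, and it does not confirm or rule out the full-pipe theory, it only shows whether the descriptor itself is open.

```python
import errno
import os
import sys


def fd_is_open(fd):
    """Return True if `fd` refers to an open file descriptor."""
    try:
        os.fstat(fd)
        return True
    except OSError as err:
        if err.errno == errno.EBADF:
            return False
        raise


def describe_stdout():
    """Summarize the state of sys.stdout for a bug report."""
    try:
        fd = sys.stdout.fileno()
    except (AttributeError, OSError, ValueError):
        # sys.stdout may be None, closed, or a stream without a descriptor.
        return "sys.stdout has no usable file descriptor"
    state = "open" if fd_is_open(fd) else "closed or invalid"
    return f"stdout fd={fd} is {state}"
```

Logging `describe_stdout()` to stderr or a file just before `print(output)` would show whether the descriptor was already closed when the error occurred.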


Flushing stdout before the print call also seems to mitigate the problem.
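
A minimal sketch of that mitigation, assuming the flush meant here is `sys.stdout.flush()`. `print_flushed` is a hypothetical helper, not part of ArchiveBox.

```python
import sys


def print_flushed(output, stream=None):
    """Flush pending data, then print `output` and flush again."""
    stream = sys.stdout if stream is None else stream
    stream.flush()  # push out anything already buffered before the large write
    print(output, file=stream, flush=True)
```

Whether this removes the root cause or only makes the error less likely is not clear from the behavior observed so far.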


Env: my ArchiveBox runs on a low-budget VPS, and the archive folder is actually a remote-mounted Google Drive folder.

Steps to reproduce

Screenshots or log output

ArchiveBox version

replace this line with the *full*, unshortened output of running `archivebox version`

sasasqt commented 11 months ago

Btw, how do I list the latest snapshot first with --sort when outputting static HTML? --sort=timestamp does the opposite.

pirate commented 11 months ago

I need the output of `archivebox version` to help; please edit your post and add it.

> btw how to specify the latest snapshot first in --sort when outputting static html? --sort=timestamp does the opposite

Use --sort=-timestamp.
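
To illustrate what the leading minus sign does, here is a small sketch that assumes the --sort value follows the common Django-style convention where a `-` prefix means descending order. `sort_snapshots` and the sample records are made up for the example and are not ArchiveBox's actual data model.

```python
def sort_snapshots(snapshots, sort):
    """Sort dicts by a field name; a leading '-' reverses the order."""
    descending = sort.startswith("-")
    field = sort.lstrip("-")
    return sorted(snapshots, key=lambda snap: snap[field], reverse=descending)


# Hypothetical sample data, for illustration only.
snapshots = [
    {"url": "https://example.com/a", "timestamp": 1700000000.0},
    {"url": "https://example.com/b", "timestamp": 1700000500.0},
]
newest_first = sort_snapshots(snapshots, "-timestamp")
oldest_first = sort_snapshots(snapshots, "timestamp")
```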