Open info-sic opened 4 months ago
Update: I can reproduce the error with a multi-sheet xlsx file (that I can't publish). I can open this file with LibreOffice and export it to PDF fine there. To me it looks like the timeout in pandora might be too short, and that somehow crashes unoserver/LibreOffice? So one workaround might be to check the status of the socket after an analysis, to see whether LibreOffice is still reachable. Just my 2ct.
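A minimal sketch of the socket check suggested above, assuming the converter listens on a local TCP port. The function name `is_reachable` and the host/port in the usage note are hypothetical; use whatever your unoserver instance is configured with. Note that a process stuck at 100% CPU may still accept TCP connections, so this can only detect a listener that is gone, not one that is hung.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

It could be called after each analysis, e.g. `is_reachable("127.0.0.1", 2002)` (the port is an assumption), and a `False` result treated as a signal to restart the converter.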
Can you increase the timeout value in the settings of the preview worker?
Update: I tried, and failed. I couldn't find the timeout parameter for preview.py.
I could reproduce the timeout with pandora.circl.lu 0d92c168-4c01-465c-8fd9-08b56f359abb You can check the file yourself. I politely ask you to srm it afterwards and keep all information confidential. Manu
Quick update on that: the file is huge (30+ MB), and generating the PDF in fact works. What fails is creating the images out of the exported PDF. I'm increasing the timeout until it works, but I'm not sure how practical that is.
That's worth a try. But anyways, the bigger problem is, that preview doesn't work at all for following uploads after it ever crashes. So a check and restart might be more important. But thx a lot, as always. Manu
yeah, seems it causes libreoffice processes to get stuck.
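The "check and restart" idea could look roughly like the sketch below: when a job raises `TimeoutError`, the helper process is killed and started again before the error is passed on. This is not how pandora handles it; `RestartableService`, `run_job`, and the command passed in are hypothetical names for illustration. A real setup would likely also have to clean up a leftover soffice.bin child.

```python
import subprocess
from typing import Any, Callable, List, Optional


class RestartableService:
    """Owns a helper process (e.g. a converter daemon) that can be force-restarted."""

    def __init__(self, cmd: List[str]) -> None:
        self.cmd = cmd
        self.proc: Optional[subprocess.Popen] = None

    def start(self) -> None:
        self.proc = subprocess.Popen(self.cmd)

    def is_alive(self) -> bool:
        return self.proc is not None and self.proc.poll() is None

    def stop(self) -> None:
        if self.proc is not None and self.proc.poll() is None:
            # SIGKILL, because a process spinning at 100% CPU may ignore SIGTERM
            self.proc.kill()
            self.proc.wait()

    def restart(self) -> None:
        self.stop()
        self.start()


def run_job(service: RestartableService, job: Callable[..., Any], *args: Any) -> Any:
    """Run job(*args); on TimeoutError restart the service, then re-raise."""
    try:
        return job(*args)
    except TimeoutError:
        service.restart()
        raise
```

With this shape, one stuck conversion still fails, but the uploads that follow get a fresh converter process instead of inheriting the stuck one.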
I might get things completely wrong, but does pandora first convert office files to PDF and then produce a PNG for the preview? LibreOffice is capable of exporting PNG directly. At least from the UI ... I have no clue whether that's scriptable, though.
The problem with that approach was (I need to try it again) that the filenames weren't something I could set myself, so they couldn't easily be shown on the web interface without reprocessing them individually. And we want the PDF export anyway, so this approach was more efficient.
But yes, as of now, it breaks the preview generator, I'll investigate.
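For reference, direct PNG export is scriptable through LibreOffice's command line (`soffice --headless --convert-to png --outdir DIR FILE`). The sketch below is not pandora's code: the helper names are hypothetical, and it assumes `soffice` is on the PATH. LibreOffice names the output after the input's base name, which matches the filename limitation described above, so the sketch renames the result afterwards. As far as I know, this export typically yields a single image (the first page or sheet), not one per page.

```python
import subprocess
from pathlib import Path
from typing import List


def build_convert_cmd(src: Path, outdir: Path, fmt: str = "png") -> List[str]:
    """Command line for a headless LibreOffice conversion of `src` into `outdir`."""
    return ["soffice", "--headless", "--convert-to", fmt, "--outdir", str(outdir), str(src)]


def expected_output(src: Path, outdir: Path, fmt: str = "png") -> Path:
    """Path LibreOffice is expected to write: input base name plus the new extension."""
    return outdir / f"{src.stem}.{fmt}"


def convert_and_rename(src: Path, outdir: Path, target_name: str,
                       fmt: str = "png", timeout: float = 120.0) -> Path:
    """Convert `src`, then rename the output to `target_name` inside `outdir`."""
    # subprocess.run kills the child and raises TimeoutExpired if it runs too long
    subprocess.run(build_convert_cmd(src, outdir, fmt), check=True, timeout=timeout)
    target = outdir / target_name
    expected_output(src, outdir, fmt).replace(target)
    return target
```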
In the meantime, I got it to work by adding the following in pandora/workers/preview.yml:

settings:
  cache: 1h
  timeout: 30m
The settings did not help with this specific xlsx; it timed out after 30 min with the same crash:

  File "/usr/lib/python3.10/socket.py", line 705, in readinto
    return self._sock.recv_into(b)
  File "/home/pandora/pandora/pandora/workers/base.py", line 102, in _raise_timeout
    raise TimeoutError

Meanwhile soffice.bin was stuck at 100% CPU, so I'm quite sure the next preview will fail too. Will tell you in 24 minutes ;)
Just making sure: you restarted the workers? The settings will not be taken into account otherwise.
I sure did, otherwise the timeout/crash would have occurred much faster...
I assume that it is beyond the scope of this project to fix soffice errors, so focusing on restarting the preview capabilities after unavoidable crashes might be a better option? Maybe https://github.com/pandora-analysis/pandora/issues/187 could be done simultaneously? Because I'd really like to see that too for the workers ;) If users don't get feedback for a longer time, they will eventually give up.
Yeah, I can see that, but the bar will not be doing anything more than moving somewhat randomly up to the time it is done and the interface doesn't tell you to wait anymore.
Never underestimate the power of Microsoft-minutes (between 2s and 2h long) and some random moving patterns to keep users on a site. As long as we don't use an RT-OS, a user agent will never know when the job is done. Btw, within the LibreOffice UI, it took about 20 seconds to convert this particular xlsx.
Aloha, my pandora instance gets rebooted every night and starts with systemd. The start-exec includes a poetry run update --yes
For unknown reasons, preview fails randomly (see also https://github.com/pandora-analysis/pandora/issues/93)
The error disappears after systemctl stop/start pandora. Any ideas for a workaround? Manu