OPM / opm-reference-manual

Python-UNO bridge to libreoffice inside docker container. #108

Closed hakonhagland closed 7 months ago

hakonhagland commented 7 months ago

Added a Python-UNO bridge to LibreOffice running inside a Docker container. It can be used to update indices automatically, and hopefully to automate other tasks in the future, such as saving the PDF file.

blattms commented 7 months ago

Thanks a lot.

Is docker/lo-ubuntu2204/poetry.lock part of the PR for a reason or by accident?

It seems we are using the hard-coded port 2022, although it is probably unlikely that this port is already in use.

hakonhagland commented 7 months ago

Seems like we are using the hard coded port 2022

@blattms Yes. Good comment, we should get rid of the hard coded port numbers. I have updated this PR to include dynamic port handling through environment variables.
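As an illustration of the environment-variable approach mentioned above, here is a minimal sketch and not the code from this PR. The variable name `LIBREOFFICE_PORT` and the helper `resolve_port` are hypothetical; the fallback value 2022 is simply the port mentioned in the review comment.

```python
import os

# Fallback taken from the port number mentioned in the review; the real
# scripts may use a different default.
DEFAULT_PORT = 2022


def resolve_port(env=None, var_name="LIBREOFFICE_PORT", default=DEFAULT_PORT):
    """Return the port given in the environment, or `default` if unset.

    `var_name` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    raw = env.get(var_name)
    if raw is None or raw.strip() == "":
        return default
    try:
        port = int(raw)
    except ValueError:
        raise ValueError(f"{var_name} must be an integer, got {raw!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"{var_name} must be in 1..65535, got {port}")
    return port
```

Both the host-side launcher and the server inside the container could call the same helper, so the two sides agree on the port as long as the variable is passed into the container.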

hakonhagland commented 7 months ago

Is docker/lo-ubuntu2204/poetry.lock part of the PR for a reason or by accident?

@blattms I followed the practice described here: https://python-poetry.org/docs/basic-usage/. Quote: "You should commit the poetry.lock file to your project repo so that all people working on the project are locked to the same versions of dependencies"

blattms commented 7 months ago

Thanks a lot. I would merge this as is. @lisajulia if you could take a short look at the Python code, that would be really cool. Feel free to merge after that.

hakonhagland commented 7 months ago

I did use it on another machine. Did execute build-image.sh but with either start-container.sh or /docker-soffice.sh I get:

See 'docker run --help'.

@blattms I will test this on a Windows and Mac machine to see if there are any issues.

hakonhagland commented 7 months ago

I did use it on another machine. Did execute build-image.sh but with either start-container.sh or /docker-soffice.sh I get:

See 'docker run --help'.

@blattms I tested docker-soffice.sh and start-container.sh on Windows and macOS, and they did not work out of the box. I have now added two PowerShell scripts, docker-soffice.ps1 and start-container.ps1, that should be run on Windows instead of the shell scripts. I also modified the shell scripts so that they work on macOS. See the last two commits for details. Can you try again with these new scripts?

hakonhagland commented 7 months ago

Where can I find instruction on how to use our docker stuff.

@blattms It is in the README.md file. I have updated it with more information, see the latest commit. Let me know if something is still missing.

blattms commented 7 months ago

After fixing the above, I see "this file has been locked by another user"

blattms commented 7 months ago

~I still see the "Update indices ..." dialog no matter whether I open the document read-only or as a copy.~

The above sentence was wrong. It is the dialog about links to external data that I am still seeing.

blattms commented 7 months ago

Should the indices be updated automatically, now? On my system they are not updated.

hakonhagland commented 7 months ago

It is the dialog about links to external data that I am still seeing.

@blattms Interesting. This dialog should be gone now due to the modification of the config file done by the call to ./update-libreoffice-config.sh at line 48 in start.sh, see: https://github.com/OPM/opm-reference-manual/blob/43d556450671fd5059a3edf399e9d05a1aeb028c/docker/lo-ubuntu2204/container/start.sh#L48

hakonhagland commented 7 months ago

This dialog should be gone now due to the modification of the config file done by the call to ./update-libreoffice-config.sh

@blattms Sorry, I thought you were running the start-container.sh script. If you are running the docker-soffice.sh the dialog is not removed. It is only when you run start-container.sh that the dialog will be removed and the index will be automatically updated.

hakonhagland commented 7 months ago

Should the indices be updated automatically, now?

@blattms If you are running docker-soffice.sh, they will not be updated automatically. However, if you run start-container.sh (and then the Python script lodocker-open-file main.fodt), they should be. I have added a comment in docker-soffice.sh to clarify this. See the latest commit.

blattms commented 7 months ago

Thanks. After I learned how to reinstall e.g. open_file.py, the dialog is indeed gone when using start-container.sh. Indices are updated automatically, which is very cool. There is a small caveat: for the very last entries (A.10 SAVE FILES), the page numbers are not correct. This is probably a race condition between rendering (which is not 100% finished) and updating the index. I guess this is tiny and we can live with that.

I still see a dialog about the locked LibreOffice file, though. Do you see that, too? Maybe this is a remnant of a killed LibreOffice on my system? Or is it intended to prevent users from saving?

hakonhagland commented 7 months ago

I still see a dialog about the locked LibreOffice file, though. Do you see that, too? Maybe this is a remnant of a killed LibreOffice on my system?

@blattms Yes, I think I have seen those files. They look like this: .~lock.main.fodt#, right? I think such a file is generated when LibreOffice opens main.fodt, and it should be deleted when LibreOffice closes that file. However, if you kill LibreOffice, the lock file may be left behind. The next time you open that file in LibreOffice, you get that warning dialog.

hakonhagland commented 7 months ago

For the very last entries (A.10 SAVE FILES), the page numbers are not correct.

@blattms Do you get correct page numbers when you open main.fodt outside the docker container, and manually update the index? I am not able to open main.fodt outside the docker container (see https://github.com/OPM/opm-reference-manual/issues/91) so I cannot check myself

blattms commented 7 months ago

It seems sufficient to test using docker-soffice.sh and updating the index manually (I have to do a save-as first, as the file is read-only). Yes, when updating manually, the page number displayed in the index matches the one in the footer when we jump to that page.

blattms commented 7 months ago

I still see a dialog about the locked LibreOffice file, though. Do you see that, too? Maybe this is a remnant of a killed LibreOffice on my system?

@blattms Yes, I think I have seen those files. They look like this: .~lock.main.fodt#, right? I think such a file is generated when LibreOffice opens main.fodt, and it should be deleted when LibreOffice closes that file. However, if you kill LibreOffice, the lock file may be left behind. The next time you open that file in LibreOffice, you get that warning dialog.

The problem is not only the dialog. You also cannot change the document (e.g. update the index).

Is there a way to remove the lock file inside docker?

hakonhagland commented 7 months ago

Is there a way to remove the lock file inside docker?

@blattms Not sure, since those files are actually on the host and not inside the container. See line 73 in start-container.sh, where we mount the host directory into the container: https://github.com/OPM/opm-reference-manual/blob/152863db2b8f5083d9f0bc843e72e8225b14c4b0/docker/lo-ubuntu2204/start-container.sh#L73

I think those files need to be removed on the host, manually or by a script, before we open a file. The problem with using a script is that it must not remove a lock file while LibreOffice actually has that file open. So the script needs to determine whether LibreOffice is using the lock file or not.
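A rough sketch of such a host-side cleanup script is shown below; it is not part of this PR. The helper names and the `libreoffice_is_running` callback are hypothetical. The callback stands in for whatever check is chosen (for example, inspecting the process list or the container state). Note that "is any LibreOffice instance running" is a coarser test than "is this particular lock file in use", so the sketch errs on the side of keeping the file.

```python
from pathlib import Path


def lock_file_for(document):
    """Return the lock-file path next to `document`.

    Follows the naming seen in this thread: main.fodt -> .~lock.main.fodt#
    """
    document = Path(document)
    return document.with_name(f".~lock.{document.name}#")


def remove_stale_lock(document, libreoffice_is_running):
    """Delete the lock file only if no LibreOffice instance is running.

    `libreoffice_is_running` is a caller-supplied zero-argument callable
    (hypothetical). Returns True if a lock file was removed.
    """
    lock = lock_file_for(document)
    if not lock.exists():
        return False  # nothing to clean up
    if libreoffice_is_running():
        return False  # the lock may be legitimate; leave it alone
    lock.unlink()
    return True
```

Passing the process check in as a callable keeps the file handling separate from the platform-specific question of how to detect a running LibreOffice.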

hakonhagland commented 7 months ago

For the very last entries (A.10 SAVE FILES), the page numbers are not correct. Probably a race condition between rendering (which is not 100% finished) and updating the index.

@blattms It should not be a race condition, since the desktop.loadComponentFromURL(...) call should not return before the document has been completely loaded, see line 50: https://github.com/OPM/opm-reference-manual/blob/152863db2b8f5083d9f0bc843e72e8225b14c4b0/docker/lo-ubuntu2204/container/docker-server.py#L50

The index is then updated at line 52 after the document has been loaded. I will do some more tests on this.
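The load-then-update sequence described above can be sketched roughly as follows. This is an illustration, not the contents of docker-server.py. It assumes `desktop` is a `com.sun.star.frame.Desktop` object already obtained over the UNO bridge, and the function name `load_and_update_indexes` is made up for this example.

```python
def load_and_update_indexes(desktop, url):
    """Load `url` via a UNO desktop object, then refresh every index.

    `desktop` is assumed to be a com.sun.star.frame.Desktop instance
    obtained elsewhere; this function only uses its UNO methods.
    """
    # Arguments: URL, target frame name, search flags, load properties.
    doc = desktop.loadComponentFromURL(url, "_blank", 0, ())
    if doc is None:
        raise RuntimeError(f"could not load {url}")
    # Text documents expose their indexes as an indexed collection.
    indexes = doc.getDocumentIndexes()
    for i in range(indexes.getCount()):
        indexes.getByIndex(i).update()
    return doc
```

Because the function only relies on the methods it calls, it can be exercised with stand-in objects when no LibreOffice instance is available.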

hakonhagland commented 7 months ago

Added a new commit that allows the docker server to load a file without updating the index at the same time. Using this new function to load the document and then updating the index manually afterwards seems to give the correct page numbers in the index. So it seems there is something going on (a bug?) with the PyUNO call index.update(), see https://github.com/OPM/opm-reference-manual/blob/9656399a15019333f98d11b7fde3b017ba52bfe9/docker/lo-ubuntu2204/container/docker-server.py#L23

blattms commented 7 months ago

Thanks a lot for the additions and the infos.