cschwan / sage-on-gentoo

(Unofficial) Gentoo Overlay for Sage- and Sage-related ebuilds

LOCKERROR when building the docs in Prefix #501

Closed strogdon closed 3 years ago

strogdon commented 6 years ago

This is just to document the issue since I've seen it several times. It occurs just as the docs are being built. Perhaps parallel build related.

[dynamics ] Exception occurred:
[dynamics ] File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/matplotlib/cbook/__init__.py", line 2482, in __enter__
[dynamics ] raise self.TimeoutError(err_str)
[dynamics ] TimeoutError: LOCKERROR: matplotlib is trying to acquire the lock
[dynamics ] u'/storage/strogdon/gentoo-rap/var/tmp/portage/sci-mathematics/sage-9999/temp/matplotlib/.matplotlib_lock-*'
[dynamics ] and has failed.  This maybe due to any other process holding this
[dynamics ] lock.  If you are sure no other matplotlib process is running try
[dynamics ] removing these folders and trying again.
[dynamics ] The full traceback has been saved in /storage/strogdon/gentoo-rap/var/tmp/portage/sci-mathematics/sage-9999/temp/sphinx-err-5Y7Dya.log, if you want to report the issue to the developers.
[dynamics ] Please also report this if it was a user error, so that a better error message can be provided next time.
[dynamics ] A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
[algebras ] reading sources... [  8%] sage/algebras/catalog
[misc     ] reading sources... [  7%] sage/ext/memory
[homology ] reading sources... [ 38%] sage/homology/delta_complex
[misc     ] reading sources... [  8%] sage/media/wav
[algebras ] reading sources... [ 10%] sage/algebras/clifford_algebra
Error building the documentation.
Traceback (most recent call last):
  File "sage_setup/docbuild/__main__.py", line 2, in <module>
    main()
  File "/storage/strogdon/gentoo-rap/var/tmp/portage/sci-mathematics/sage-9999/work/sage-9999/src-python2_7/sage_setup/docbuild/__init__.py", line 1675, in main
    builder()
  File "/storage/strogdon/gentoo-rap/var/tmp/portage/sci-mathematics/sage-9999/work/sage-9999/src-python2_7/sage_setup/docbuild/__init__.py", line 310, in _wrapper
    getattr(get_builder(document), 'inventory')(*args, **kwds)
  File "/storage/strogdon/gentoo-rap/var/tmp/portage/sci-mathematics/sage-9999/work/sage-9999/src-python2_7/sage_setup/docbuild/__init__.py", line 505, in _wrapper
    build_many(build_ref_doc, L)
  File "/storage/strogdon/gentoo-rap/var/tmp/portage/sci-mathematics/sage-9999/work/sage-9999/src-python2_7/sage_setup/docbuild/__init__.py", line 246, in build_many
    ret = x.get(99999)
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
OSError: [dynamics ] Exception occurred:

The lock folder

ls -l /storage/strogdon/gentoo-rap/var/tmp/portage/sci-mathematics/sage-9999/temp/matplotlib/.matplotlib_lock-608/
total 0

And the referenced sphinx-err-5Y7Dya.log

# Sphinx version: 1.6.3
# Python version: 2.7.14 (CPython)
# Docutils version: 0.14 
# Jinja2 version: 2.10
# Last messages:

# Loaded extensions:
Traceback (most recent call last):
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/sphinx/cmdline.py", line 305, in main
    opts.warningiserror, opts.tags, opts.verbosity, opts.jobs)
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/sphinx/application.py", line 196, in __init__
    self.setup_extension(extension)
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/sphinx/application.py", line 456, in setup_extension
    self.registry.load_extension(self, extname)
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/sphinx/registry.py", line 196, in load_extension
    mod = __import__(extname, None, None, ['setup'])
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/matplotlib/sphinxext/plot_directive.py", line 178, in <module>
    import matplotlib.pyplot as plt
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/matplotlib/pyplot.py", line 29, in <module>
    import matplotlib.colorbar
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/matplotlib/colorbar.py", line 36, in <module>
    import matplotlib.contour as contour
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/matplotlib/contour.py", line 21, in <module>
    import matplotlib.font_manager as font_manager
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/matplotlib/font_manager.py", line 1454, in <module>
    _rebuild()
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/matplotlib/font_manager.py", line 1437, in _rebuild
    with cbook.Locked(cachedir):
  File "/storage/strogdon/gentoo-rap/usr/lib64/python2.7/site-packages/matplotlib/cbook/__init__.py", line 2482, in __enter__
    raise self.TimeoutError(err_str)
TimeoutError: LOCKERROR: matplotlib is trying to acquire the lock
    u'/storage/strogdon/gentoo-rap/var/tmp/portage/sci-mathematics/sage-9999/temp/matplotlib/.matplotlib_lock-*'
and has failed.  This maybe due to any other process holding this
lock.  If you are sure no other matplotlib process is running try
removing these folders and trying again.

kiwifb commented 6 years ago

Hum... I think I know which lock we are talking about, and I thought Jeroen had removed it after some other work I did. Maybe that change is not in 8.2.beta6; I'll have to dig.

kiwifb commented 6 years ago

No, I was confusing this with the doctest forker. Did it appear after the upgrade to MPL-2.1.0, or was it happening before?

kiwifb commented 6 years ago

It is definitely a parallel build issue that likely needs to be dealt with upstream. How many threads/tasks are you using?

strogdon commented 6 years ago

This happened after the upgrade to 8.2.beta6, so MPL-2.1.0 was already there. However, I noticed that I'm using an MPL-2.1.0 from my local overlay, although I don't see any potential parallel build changes between my local overlay matplotlib and the one in the main tree. I should upgrade. I have 12 threads here. The failure is fairly random and I haven't been able to get it to fail twice in a row.

strogdon commented 6 years ago

I got this LOCKERROR again, same Prefix, when building 8.3.beta7. Restarting the build was then successful.

strogdon commented 6 years ago

Actually, I get this LOCKERROR on almost every rebuild of 8.3.beta7 in Prefix. This is new. When it fails, I can usually continue building the html-docs by removing the matplotlib lock folder and then running ebuild sage-9999 install.
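
For reference, the manual cleanup step can be sketched in Python. This is only an illustration: the function name `remove_stale_locks` is hypothetical, the `.matplotlib_lock-*` pattern is taken from the error message above, and the sketch assumes no other matplotlib process is still running. It is written for Python 3, whereas the build in this report ran under Python 2.7.

```python
import glob
import os
import shutil


def remove_stale_locks(cache_dir, pattern=".matplotlib_lock-*"):
    """Remove leftover lock folders under cache_dir and return their paths.

    Hypothetical helper: only call it when no matplotlib process is running,
    since it cannot tell a stale lock from one that is actively held.
    """
    removed = []
    for path in sorted(glob.glob(os.path.join(cache_dir, pattern))):
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

It would be called with the matplotlib cache directory named in the error message (the `temp/matplotlib` folder of the Portage build directory) before resuming the build.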

kiwifb commented 6 years ago

Hum, any zombie processes around?

strogdon commented 6 years ago

Built with MAKEOPTS=-j10 instead of the usual MAKEOPTS=-j13.

kiwifb commented 6 years ago

Is this still happening?

strogdon commented 6 years ago

Occasionally, yes. It is random, but rare.

strogdon commented 5 years ago

Just for info, I haven't seen this LOCKERROR in some time. I'm not sure when I stopped seeing it.

strogdon commented 5 years ago

This has now appeared again while building 8.9.beta8. I continued the build using the procedure from my earlier comment (ebuild sage-9999 install), except that I did not remove the matplotlib lock folder. Apparently removing it is not needed?

kiwifb commented 5 years ago

I am really clueless with regard to this issue. What kind of file system do you have?

strogdon commented 5 years ago

The OS is Debian and the file system doesn't appear to be odd:

Filesystem                                            Type      Size  Used Avail Use% Mounted on
udev                                                  devtmpfs   10M     0   10M   0% /dev
tmpfs                                                 tmpfs     3.2G  676K  3.2G   1% /run
/dev/mapper/blitzen-root                              ext4      427G  197G  208G  49% /
tmpfs                                                 tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs                                                 tmpfs     6.3G     0  6.3G   0% /run/shm
/dev/sda2                                             ext2      229M   21M  197M  10% /boot
/dev/sda1                                             vfat      487M  128K  486M   1% /boot/efi

I'm wondering what is creating the lock folder since it doesn't appear to be used.
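
One way to read the empty folder is that the folder itself is the lock: only its existence matters, so nothing is ever written into it. Below is a minimal sketch of such a directory-based lock. It is not matplotlib's actual code. The class names `DirLock` and `LockTimeout`, the PID suffix, and the retry count and delay are all assumptions made for illustration, guided only by the `.matplotlib_lock-608` folder name and the `cbook.Locked(cachedir)` call in the traceback.

```python
import glob
import os
import time


class LockTimeout(RuntimeError):
    """Raised when another process's lock folder does not go away in time."""


class DirLock:
    """Hypothetical PID-suffixed directory lock (not matplotlib's code)."""

    def __init__(self, cache_dir, retries=50, delay=0.1):
        # Assumed naming scheme: one empty folder per process, suffixed by PID.
        self.own = os.path.join(cache_dir, ".matplotlib_lock-%d" % os.getpid())
        self.pattern = os.path.join(cache_dir, ".matplotlib_lock-*")
        self.retries = retries
        self.delay = delay

    def __enter__(self):
        for _ in range(self.retries):
            others = [p for p in glob.glob(self.pattern) if p != self.own]
            if not others:
                # The check above and this mkdir are separate steps, so
                # parallel workers can race between them.
                os.makedirs(self.own, exist_ok=True)
                return self
            time.sleep(self.delay)
        # A folder left behind by a killed process ends up here every time.
        raise LockTimeout("could not acquire %s" % self.pattern)

    def __exit__(self, *exc):
        try:
            os.rmdir(self.own)
        except OSError:
            pass
        return False
```

Under these assumptions, a worker that holds the lock longer than the total wait time, or a folder left behind by an interrupted process, would make every other worker time out, which would be consistent with a failure that is random and tied to the number of parallel jobs.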

strogdon commented 5 years ago

I'm wondering whether https://github.com/matplotlib/matplotlib/pull/10596 might be the solution. It appears to be in matplotlib 3.0. However, it was prompted by the lock folder associated with tex.cache. The lock folder here is not under the tex.cache folder:

ls -al temp/matplotlib/
total 136
drwxr-xr-x 4 strogdon math   4096 Aug 26 14:58 .
drwxr-xr-x 5 strogdon math   4096 Aug 26 16:29 ..
drwxr-xr-x 2 strogdon math   4096 Aug 26 14:58 .matplotlib_lock-4729
-rw-r--r-- 1 strogdon math 121270 Aug 26 14:58 fontList.json
drwxr-xr-x 2 strogdon math   4096 Aug 26 14:58 tex.cache

so it may be different.

kiwifb commented 5 years ago

Interesting, but it relies on python3 features, so no backporting. On the other hand, we may very well just ditch python2 in sage-on-gentoo by the end of the year. I don't think it will stay for long in the main tree after support ends at the end of the year.

strogdon commented 4 years ago

I haven't seen this issue in some time. However, the last time I noted this, it appeared again within days. Perhaps, if it doesn't appear after the release of 9.1, this should be closed?

kiwifb commented 4 years ago

We'll probably have some problem by then. I currently cannot build the doc on the vbraun branch and I don't know why. The obvious culprit isn't.

kiwifb commented 4 years ago

Well, the obvious culprit was guilty in the end. Even if I don't know why.

kiwifb commented 3 years ago

Closing old.