getpelican / pelican

Static site generator that supports Markdown and reST syntax. Powered by Python.
https://getpelican.com
GNU Affero General Public License v3.0

Running `pelican -r -l` produces Pickle-related error #3394

Closed olivierverdier closed 2 months ago

olivierverdier commented 2 months ago

Issue

How to reproduce

Preliminary steps

With the latest Pelican version (from GitHub), create a new blog with pelican-quickstart. Go into the newly created site and run ~make devserver~ pelican -lr.

Outcome

The error stems from a pickling failure, but it surfaces as:

[11:02:20] CRITICAL AttributeError: 'NoneType' object has no attribute        __init__.py:683
                    'terminate'                                                              

Explanation

The problem is that the process p1 created in __init__.main contains an object of type RichHandler, which cannot be pickled.
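To illustrate why a log handler resists pickling, here is a minimal sketch. It uses the standard library's `logging.Handler` as a stand-in for `RichHandler` (an assumption, so that the example runs without Rich installed); the helper name `try_pickle` is made up for this example and is not part of Pelican.

```python
import logging
import pickle


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the exception raised."""
    try:
        pickle.dumps(obj)
    except Exception as exc:  # broad on purpose: we only report the failure
        return exc
    return None


# Stand-in for RichHandler (assumption): every logging.Handler creates a
# threading.RLock in its constructor, and lock objects cannot be pickled.
handler = logging.Handler()
error = try_pickle(handler)
print(type(error).__name__, error)
```

Any object that holds a reference to such a handler is unpicklable as well, which is presumably how the handler ends up breaking the start of p1.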

~Solution~ Workarounds

Use pelican -lr --log-handler=plain until the problem is fixed.

If you call pelican -lr from make devserver, modify the Makefile to specify a plain log handler, for instance by changing the line containing PELICANOPTS:

PELICANOPTS=--log-handler plain

justinmayer commented 2 months ago

Upon further review, to be fair to @ben-n93, issue #3397 refers to running pelican -r -l, which is where the problem actually originates. The Makefile changes suggested in this issue only resolve the problem at the make devserver wrapper level — not at the level of the underlying pelican -r -l command that said wrapper invokes. So the Makefile change is really only a useful workaround for folks who use the Make automation, and the problem should really instead be fixed where it is actually occurring.

justinmayer commented 2 months ago

@cpitclaudel: It would seem that your PR #3293 is related to the new error reported in this issue. This error appears when running pelican -r -l. Could you take a look and see whether you can find a way to resolve this issue?

olivierverdier commented 2 months ago

Sure, maybe the title should be changed to "Pickling error with pelican -lr" instead.

The workaround I suggest is, more generally, to change the log handler by adding the extra option --log-handler=plain.

So, pelican -lr --log-handler=plain is another way to solve the problem.

justinmayer commented 2 months ago

From my perspective, appending --log-handler=plain to pelican -l -r is a temporary workaround — not a solution — because the changes in the aforementioned PR have broken backwards-compatibility (presumably inadvertently), which was not clear at the time the PR was merged. This should still be fixed such that invoking pelican -l -r does not produce an error.

olivierverdier commented 2 months ago

You are right, it is a workaround.

A better fix is to remove the log_handler attribute of the args object right after calling init_logging in the main function in __init__.py. Something like:

delattr(args, "log_handler")

I'm not saying it's a complete solution, but it works, and you can call pelican -lr now without having to change the log_handler.
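As a toy illustration of why dropping the attribute helps (this is not Pelican's code): the sketch below assumes the parsed arguments are an `argparse.Namespace` whose `log_handler` attribute holds a handler object, and uses `logging.Handler` as a stand-in for the real handler. The attribute values and the `is_picklable` helper are hypothetical.

```python
import argparse
import logging
import pickle


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (TypeError, AttributeError, pickle.PicklingError):
        return False
    return True


# Hypothetical stand-in for the parsed command-line arguments.
args = argparse.Namespace(
    autoreload=True,
    listen=True,
    log_handler=logging.Handler(),  # holds an RLock, so it cannot be pickled
)

before = is_picklable(args)
delattr(args, "log_handler")  # the workaround suggested above
after = is_picklable(args)
print(before, after)
```

Once the unpicklable attribute is gone, the namespace contains only plain values and can be sent to a child process.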

cpitclaudel commented 2 months ago

Hi all, sorry to hear about the trouble.

@olivierverdier I tried to follow the repro, but I must have missed something:

$ mkdir tmp; cd tmp
$ git clone git@github.com:getpelican/pelican.git
Cloning into 'pelican'...
remote: Enumerating objects: 24281, done.        
remote: Counting objects: 100% (13/13), done.        
remote: Compressing objects: 100% (13/13), done.        
remote: Total 24281 (delta 4), reused 3 (delta 0), pack-reused 24268 (from 1)        
Receiving objects: 100% (24281/24281), 7.16 MiB | 7.74 MiB/s, done.
Resolving deltas: 100% (16349/16349), done.
$ cd pelican/
$ python3 -m venv .venv
$ source .venv/bin/activate 
$ pip install invoke
Collecting invoke
  Using cached invoke-2.2.0-py3-none-any.whl.metadata (3.3 kB)
Using cached invoke-2.2.0-py3-none-any.whl (160 kB)
Installing collected packages: invoke
Successfully installed invoke-2.2.0
$ invoke setup
[...]
$ mkdir tmp
$ cd tmp/
$ python3 -c "import pelican; print(pelican.__file__)"
<redacted>/tmp/pelican/pelican/__init__.py
$ pelican-quickstart
[..]
$ which pelican
<redacted>/tmp/pelican/.venv/bin/pelican
$ pelican -lr
  --- AutoReload Mode: Monitoring `content`, `theme` and `settings` for changes.
---
Serving site at: http://127.0.0.1:8000 - Tap CTRL-C to stop
Done: Processed 0 articles, 0 drafts, 0 hidden articles, 0 pages, 0 hidden pages
and 0 draft pages in 0.03 seconds.
[16:31:11] WARNING  Unable to watch path                            utils.py:843
                    '<redacted>/tmp/pelican/tmp/content/images'              
                    as it does not exist.                                       
  C-c C-c[16:31:17] WARNING  Keyboard interrupt received. Exiting.        __init__.py:681

What am I missing?

justinmayer commented 2 months ago

@cpitclaudel: I can reproduce the error every time. Just to make sure, I followed the exact set of steps you listed, and I still get the error. (Python 3.12.6 on macOS Sonoma 14.7)

olivierverdier commented 2 months ago

Well, the steps you printed out above are exactly what is needed to reproduce. 😅

Maybe it's OS related? An issue with macOS perhaps?

cpitclaudel commented 2 months ago

Fascinating. I'm on Ubuntu 24.04. Any chance you might share a backtrace?

cpitclaudel commented 2 months ago

I suppose if the problem really is pickling then the following patch should help? https://github.com/getpelican/pelican/compare/main...cpitclaudel:pelican:cpc/formatter-pickle

olivierverdier commented 2 months ago

Yes, your patch works for me.

avaris commented 2 months ago

> I suppose if the problem really is pickling then the following patch should help? main...cpitclaudel:pelican:cpc/formatter-pickle

That would likely do the trick.

And I think it is related to how multiprocessing is handled in Python (e.g. spawn vs fork), so it is somewhat OS related. By default, Linux uses fork, while macOS and Windows use spawn. Without going into much detail, this should only affect the spawn start method, because spawn has to pickle the process object to send it to the child, whereas fork does not.
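A quick way to check this on a given machine, as a rough sketch: `multiprocessing.get_start_method()` reports the default start method, and `ForkingPickler` is the pickler that the traceback's `reduction.dump` relies on. The helper name `survives_spawn` and the sample values are made up for this example.

```python
import multiprocessing as mp
import pickle
import threading
from multiprocessing.reduction import ForkingPickler


def survives_spawn(obj):
    """Return True if obj can be serialized the way spawn serializes it."""
    try:
        ForkingPickler.dumps(obj)
    except (TypeError, AttributeError, pickle.PicklingError):
        return False
    return True


# Default start method on this platform and Python version.
start_method = mp.get_start_method()
print("default start method:", start_method)

# Plain data crosses the process boundary; a lock does not.
print("plain tuple picklable:", survives_spawn(("content", 8000)))
print("RLock picklable:", survives_spawn(threading.RLock()))
```

If the reported method is fork, the child inherits the parent's memory and nothing is pickled at start, which would explain why the error does not show up on Ubuntu.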

olivierverdier commented 2 months ago

If anyone is interested in the traceback, here it is:

╭──────────────────────────── Traceback (most recent call last) ────────────────────────────╮
│ _____________________________/pelican/pelican/__init__.py:663 in main                     │
│                                                                                           │
│   660 │   │   │   │   ),                                                                  │
│   661 │   │   │   )                                                                       │
│   662 │   │   │   try:                                                                    │
│ ❱ 663 │   │   │   │   p1.start()                                                          │
│   664 │   │   │   │   p2.start()                                                          │
│   665 │   │   │   │   exc = excqueue.get()                                                │
│   666 │   │   │   │   if exc is not None:                                                 │
│                                                                                           │
│ _________________________________________________________________________________/lib/pyt │
│ hon3.12/multiprocessing/process.py:121 in start                                           │
│                                                                                           │
│   118 │   │   assert not _current_process._config.get('daemon'), \                        │
│   119 │   │   │      'daemonic processes are not allowed to have children'                │
│   120 │   │   _cleanup()                                                                  │
│ ❱ 121 │   │   self._popen = self._Popen(self)                                             │
│   122 │   │   self._sentinel = self._popen.sentinel                                       │
│   123 │   │   # Avoid a refcycle if the target function holds an indirect                 │
│   124 │   │   # reference to the process object (see bpo-30775)                           │
│                                                                                           │
│ _________________________________________________________________________________/lib/pyt │
│ hon3.12/multiprocessing/context.py:224 in _Popen                                          │
│                                                                                           │
│   221 │   _start_method = None                                                            │
│   222 │   @staticmethod                                                                   │
│   223 │   def _Popen(process_obj):                                                        │
│ ❱ 224 │   │   return _default_context.get_context().Process._Popen(process_obj)           │
│   225 │                                                                                   │
│   226 │   @staticmethod                                                                   │
│   227 │   def _after_fork():                                                              │
│                                                                                           │
│ _________________________________________________________________________________/lib/pyt │
│ hon3.12/multiprocessing/context.py:289 in _Popen                                          │
│                                                                                           │
│   286 │   │   @staticmethod                                                               │
│   287 │   │   def _Popen(process_obj):                                                    │
│   288 │   │   │   from .popen_spawn_posix import Popen                                    │
│ ❱ 289 │   │   │   return Popen(process_obj)                                               │
│   290 │   │                                                                               │
│   291 │   │   @staticmethod                                                               │
│   292 │   │   def _after_fork():                                                          │
│                                                                                           │
│ _________________________________________________________________________________/lib/pyt │
│ hon3.12/multiprocessing/popen_spawn_posix.py:32 in __init__                               │
│                                                                                           │
│   29 │                                                                                    │
│   30 │   def __init__(self, process_obj):                                                 │
│   31 │   │   self._fds = []                                                               │
│ ❱ 32 │   │   super().__init__(process_obj)                                                │
│   33 │                                                                                    │
│   34 │   def duplicate_for_child(self, fd):                                               │
│   35 │   │   self._fds.append(fd)                                                         │
│                                                                                           │
│ _________________________________________________________________________________/lib/pyt │
│ hon3.12/multiprocessing/popen_fork.py:19 in __init__                                      │
│                                                                                           │
│   16 │   │   util._flush_std_streams()                                                    │
│   17 │   │   self.returncode = None                                                       │
│   18 │   │   self.finalizer = None                                                        │
│ ❱ 19 │   │   self._launch(process_obj)                                                    │
│   20 │                                                                                    │
│   21 │   def duplicate_for_child(self, fd):                                               │
│   22 │   │   return fd                                                                    │
│                                                                                           │
│ _________________________________________________________________________________/lib/pyt │
│ hon3.12/multiprocessing/popen_spawn_posix.py:47 in _launch                                │
│                                                                                           │
│   44 │   │   set_spawning_popen(self)                                                     │
│   45 │   │   try:                                                                         │
│   46 │   │   │   reduction.dump(prep_data, fp)                                            │
│ ❱ 47 │   │   │   reduction.dump(process_obj, fp)                                          │
│   48 │   │   finally:                                                                     │
│   49 │   │   │   set_spawning_popen(None)                                                 │
│   50                                                                                      │
│                                                                                           │
│ _________________________________________________________________________________/lib/pyt │
│ hon3.12/multiprocessing/reduction.py:60 in dump                                           │
│                                                                                           │
│    57                                                                                     │
│    58 def dump(obj, file, protocol=None):                                                 │
│    59 │   '''Replacement for pickle.dump() using ForkingPickler.'''                       │
│ ❱  60 │   ForkingPickler(file, protocol).dump(obj)                                        │
│    61                                                                                     │
│    62 #                                                                                   │
│    63 # Platform specific definitions                                                     │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: cannot pickle '_thread.RLock' object

cpitclaudel commented 2 months ago

Thanks a lot for testing. I converted the patch to a pull request.