Open moreati opened 6 years ago
This is fantastic work! Docs or issue would be my preference :)
FWIW the bit that really swings me on cPickle needing to die is a python-dev mailing list thread from ~2002 era. I can't remember how I found that thread. I started by git blaming https://docs.python.org/2/library/pickle.html to find whoever wrote that warning at the top. If you follow the whole way back through the renames (and conversion from TeX to rst!!) you'll find a date and author. If you then visit the python-dev list archives around those months, there is a guru-level thread basically warning against even what we're doing today.
I'm going to need your snail mail address soon to start delivering kegs of beer, if you keep up the Dune puns I'm going to be bankrupted haha
Btw I saw Chopsticks has a super compact codec, I really liked the look of it, except that it invokes isinstance() constantly rather than only on a slow path.. ideally that dispatch could be constant time, with something like:
try:
return converters[type(obj)](obj)
except KeyError:
# slow path
But perhaps he discovered that repeat isinstance() is faster than generating a function call for every object in the graph
ideally that dispatch could be constant time, with something like
try: ... except KeyError:
funny you should mention that ...
Ah yes, one final thing to keep in mind: it'd be /so/ nice if any new solution gave us more latitude to make the library more user-friendly. I really wish we could just pickle the hell out of every user type and watch the world burn, but security concerns of course prevents that. Meanwhile the solution today is so annoyingly restrictive. I wish there was some middle ground that preserved safety and convenience. Zero clue what it would look like, though, but I'm certain the serializer sits front and center of any such improvement
All accounts suggest it cannot be used securely, however few of those accounts appear to be expert
I'm starting to get a sense of what you meant by this. Although
I squeezed some bytes and CPU cycles out of Chopsticks pencode()
. It's now closer to constant time dispatch https://github.com/lordmauve/chopsticks/pull/61
Champion! How was the before/after benchmark when you introduced all those function calls? I may check it out and try running it later, I'm a sucker for little microbenchmarks like this
On 17 March 2018 at 03:36, Alex Willmer notifications@github.com wrote:
I squeezed some bytes and CPU cycles out of Chopsticks pencode(). It's now closer to constant time dispatch lordmauve/chopsticks#61 https://github.com/lordmauve/chopsticks/pull/61
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dw/mitogen/issues/126#issuecomment-373854818, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAJCzqlBdSVr6JbrExLmQkNEISu00IZks5tfDPmgaJpZM4SlX_K .
Each commit after benchmark_pencode.py has the time in the commit message, just shy of 50% reduction overall.
On 16 March 2018 at 21:55, dw notifications@github.com wrote:
Champion! How was the before/after benchmark when you introduced all those function calls? I may check it out and try running it later, I'm a sucker for little microbenchmarks like this
On 17 March 2018 at 03:36, Alex Willmer notifications@github.com wrote:
I squeezed some bytes and CPU cycles out of Chopsticks pencode(). It's now closer to constant time dispatch lordmauve/chopsticks#61 https://github.com/lordmauve/chopsticks/pull/61
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dw/mitogen/issues/126#issuecomment-373854818, or mute the thread https://github.com/notifications/unsubscribe-auth/ AAAJCzqlBdSVr6JbrExLmQkNEISu00IZks5tfDPmgaJpZM4SlX_K .
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/dw/mitogen/issues/126#issuecomment-373855552, or mute the thread https://github.com/notifications/unsubscribe-auth/AAKpclHbJBZKYo3LiZ9DtVAYpc38mo8Yks5tfDTRgaJpZM4SlX_K .
-- Alex Willmer alex@moreati.org.uk
funny you should mention that ...
Noting the date of the first commit, now this makes sense :P~
If you follow the whole way back through the renames (and conversion from TeX to rst!!) you'll find a date and author. If you then visit the python-dev list archives around those months, there is a guru-level thread basically warning against even what we're doing today.
2 commits appear to fit this bill
I've not found the thread yet.
I might have just been thinking about the associated bug -- https://bugs.python.org/issue471893 , it's pretty guru level already. But I could've sworn I saw mailman archive headers flicking past. Still digging, will comment here if I find it.
On 17 March 2018 at 04:15, Alex Willmer notifications@github.com wrote:
If you follow the whole way back through the renames (and conversion from TeX to rst!!) you'll find a date and author. If you then visit the python-dev list archives around those months, there is a guru-level thread basically warning against even what we're doing today.
2 commits appear to fit this bill
- Nov 2001 python/cpython@f595fd9 https://github.com/python/cpython/commit/f595fd975da6338ae7b5a5355833ee73d924ed20
- May 2003 python/cpython@7696344 https://github.com/python/cpython/commit/7696344182e959338851ec6de84af30cc6ac55d7
I've not found the thread yet.
PS: better pickle punage https://mail.python.org/pipermail/python-dev/2003-February/033320.html
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dw/mitogen/issues/126#issuecomment-373862269, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAJC5Q0JMxMtljHlUcv7eaDXrxksf_2ks5tfD0EgaJpZM4SlX_K .
https://www.python.org/dev/peps/pep-0307/#security-issues
On 17 March 2018 at 04:25, David Wilson dw@botanicus.net wrote:
I might have just been thinking about the associated bug -- https://bugs.python.org/issue471893 , it's pretty guru level already. But I could've sworn I saw mailman archive headers flicking past. Still digging, will comment here if I find it.
On 17 March 2018 at 04:15, Alex Willmer notifications@github.com wrote:
If you follow the whole way back through the renames (and conversion from TeX to rst!!) you'll find a date and author. If you then visit the python-dev list archives around those months, there is a guru-level thread basically warning against even what we're doing today.
2 commits appear to fit this bill
- Nov 2001 python/cpython@f595fd9 https://github.com/python/cpython/commit/f595fd975da6338ae7b5a5355833ee73d924ed20
- May 2003 python/cpython@7696344 https://github.com/python/cpython/commit/7696344182e959338851ec6de84af30cc6ac55d7
I've not found the thread yet.
PS: better pickle punage https://mail.python.org/pipermail/python-dev/2003-February/033320.html
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dw/mitogen/issues/126#issuecomment-373862269, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAJC5Q0JMxMtljHlUcv7eaDXrxksf_2ks5tfD0EgaJpZM4SlX_K .
FWIW it's not the presence of GvR on that PEP that gives me the heeby-jeebies, it's Tim Peters. You find him 2 years earlier on that bug almost implying that find_global (like we use today) is good enough. And that guy is the definition of smart, if something changed his mind, I'm not going to question why
On 17 March 2018 at 04:31, David Wilson dw@botanicus.net wrote:
https://www.python.org/dev/peps/pep-0307/#security-issues
On 17 March 2018 at 04:25, David Wilson dw@botanicus.net wrote:
I might have just been thinking about the associated bug -- https://bugs.python.org/issue471893 , it's pretty guru level already. But I could've sworn I saw mailman archive headers flicking past. Still digging, will comment here if I find it.
On 17 March 2018 at 04:15, Alex Willmer notifications@github.com wrote:
If you follow the whole way back through the renames (and conversion from TeX to rst!!) you'll find a date and author. If you then visit the python-dev list archives around those months, there is a guru-level thread basically warning against even what we're doing today.
2 commits appear to fit this bill
- Nov 2001 python/cpython@f595fd9 https://github.com/python/cpython/commit/f595fd975da6338ae7b5a5355833ee73d924ed20
- May 2003 python/cpython@7696344 https://github.com/python/cpython/commit/7696344182e959338851ec6de84af30cc6ac55d7
I've not found the thread yet.
PS: better pickle punage https://mail.python.org/pipermail/python-dev/2003-February/033320.html
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dw/mitogen/issues/126#issuecomment-373862269, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAJC5Q0JMxMtljHlUcv7eaDXrxksf_2ks5tfD0EgaJpZM4SlX_K .
I also investigating auditing cPickle (and only cPickle) directly. It's 5000 lines of C that interacts heavily with the runtime (e.g. copyreg). It could be done, but it's serious work
On 17 March 2018 at 04:33, David Wilson dw@botanicus.net wrote:
FWIW it's not the presence of GvR on that PEP that gives me the heeby-jeebies, it's Tim Peters. You find him 2 years earlier on that bug almost implying that find_global (like we use today) is good enough. And that guy is the definition of smart, if something changed his mind, I'm not going to question why
On 17 March 2018 at 04:31, David Wilson dw@botanicus.net wrote:
https://www.python.org/dev/peps/pep-0307/#security-issues
On 17 March 2018 at 04:25, David Wilson dw@botanicus.net wrote:
I might have just been thinking about the associated bug -- https://bugs.python.org/issue471893 , it's pretty guru level already. But I could've sworn I saw mailman archive headers flicking past. Still digging, will comment here if I find it.
On 17 March 2018 at 04:15, Alex Willmer notifications@github.com wrote:
If you follow the whole way back through the renames (and conversion from TeX to rst!!) you'll find a date and author. If you then visit the python-dev list archives around those months, there is a guru-level thread basically warning against even what we're doing today.
2 commits appear to fit this bill
- Nov 2001 python/cpython@f595fd9 https://github.com/python/cpython/commit/f595fd975da6338ae7b5a5355833ee73d924ed20
- May 2003 python/cpython@7696344 https://github.com/python/cpython/commit/7696344182e959338851ec6de84af30cc6ac55d7
I've not found the thread yet.
PS: better pickle punage https://mail.python.org/pipermail/python-dev/2003-February/033320.html
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dw/mitogen/issues/126#issuecomment-373862269, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAJC5Q0JMxMtljHlUcv7eaDXrxksf_2ks5tfD0EgaJpZM4SlX_K .
I might have just been thinking about the associated bug -- https://bugs.python.org/issue471893 , it's pretty guru level already
One of the concerns in issue 471893 is the use of eval()
to parse strings. This was fixed by Changeset ab63af, in Python 2.3 onward neither pickle
or cPickle
invoke eval()
directly.
Mitogen currently uses cPickle (with some filtering) to serialize data passed between parents and children. Pickle is widely regarded as risky - deserializing an arbitrary pickle from an untrusted source can trivially result in code execution. Mitogen has some protection is in-place already
This is a medium to long term undertaking. To assess the feasibility of replacing Pickle I plan to
@dw I can document my progress in this issue, as changes to the docs (in a branch/PR), or perhaps as a Wiki page. Do you have a preference?