python / cpython

The Python programming language
https://www.python.org

Expose a dictionary of interned strings in sys module #78567

Closed 35055fa0-fc4d-4752-82be-528e15f2df01 closed 6 years ago

35055fa0-fc4d-4752-82be-528e15f2df01 commented 6 years ago
BPO 34386
Nosy @rhettinger, @ronaldoussoren, @serhiy-storchaka, @rushter

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['interpreter-core', 'type-feature', '3.8']
title = 'Expose a dictionary of interned strings in sys module'
updated_at =
user = 'https://github.com/rushter'
```

bugs.python.org fields:
```python
activity =
actor = 'serhiy.storchaka'
assignee = 'none'
closed = True
closed_date =
closer = 'serhiy.storchaka'
components = ['Interpreter Core']
creation =
creator = 'rushter'
dependencies = []
files = []
hgrepos = []
issue_num = 34386
keywords = []
message_count = 7.0
messages = ['323437', '323440', '323450', '323453', '323455', '323468', '323470']
nosy_count = 4.0
nosy_names = ['rhettinger', 'ronaldoussoren', 'serhiy.storchaka', 'rushter']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue34386'
versions = ['Python 3.8']
```

35055fa0-fc4d-4752-82be-528e15f2df01 commented 6 years ago

Python provides the ability to intern strings (sys.intern). It would be useful to expose a read-only dictionary of interned strings to Python users so that we can see which strings are interned.

It would take minimal changes, since internally it's just a dictionary. Is this worth adding to the sys module?
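
For readers unfamiliar with interning, here is a minimal sketch of what sys.intern does from the Python side. The strings are built at runtime so that the compiler cannot merge them into one constant; the variable names are only illustrative.

```python
import sys

# Build two equal strings at runtime so they are not folded into one constant.
parts = ["interned", "example"]
a = " ".join(parts)
b = " ".join(parts)

print(a == b)  # the two strings are equal by value
print(a is b)  # identity is not guaranteed for strings built at runtime

# sys.intern returns the canonical copy kept in the interpreter's intern table.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # both names now refer to the same object
```

The table that sys.intern consults is exactly what the request above asks to expose; from Python code it can only be observed indirectly, through identity checks like the last one.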

ronaldoussoren commented 6 years ago

Could you explain why this would be useful?

rhettinger commented 6 years ago

I wouldn't want a user to be able to mutate the dictionary directly (otherwise, non-strings could be added).

ronaldoussoren commented 6 years ago

Another reason for not wanting write access to the sys.intern dictionary is that this dictionary does not own references to its keys and values.
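
A loose Python-level analogy for a table that does not own references is weakref.WeakValueDictionary. This is only an analogy under stated assumptions: str does not support weak references, so a hypothetical `Token` class stands in for the interned object, and CPython's actual bookkeeping happens in C rather than through weakref.

```python
import gc
import weakref


class Token:
    """Hypothetical interned object; str itself cannot be weakly referenced."""

    def __init__(self, text):
        self.text = text


# The table holds weak references, so it does not keep its values alive.
table = weakref.WeakValueDictionary()

tok = Token("spam")
table["spam"] = tok
before = len(table)  # the entry exists while `tok` is alive

del tok       # drop the only strong reference
gc.collect()  # force collection on implementations without reference counting
after = len(table)  # the entry has disappeared along with the object

print(before, after)
```

Entries in such a table can vanish whenever the last outside reference goes away, which is part of why handing out write access to it would be fragile.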

35055fa0-fc4d-4752-82be-528e15f2df01 commented 6 years ago

Thank you, I agree. I can't come up with practical use cases other than my curiosity.

Is it possible to somehow expose the dictionary in the debug build of Python? Currently, there is no way to access it from the interpreter even with ctypes.

ronaldoussoren commented 6 years ago

IMHO we shouldn't expose the intern dictionary without there being a clear, and good enough, use case for doing so.

Exposing the dictionary decreases implementation flexibility, and increases requirements on other implementations. One example of the former: at least in theory the interning dictionary could be a set, but we couldn't make that change if the dictionary were exposed in the API.

With current information I'm -1 on exposing the dictionary, and -0 on doing that for debug builds only.

serhiy-storchaka commented 6 years ago

I concur with Ronald. Using a dict instance is an implementation detail. CPython could use a dict, a set, a hashtable implementation from Modules/hashtable.c, or the HAMT implementation from Python/hamt.c. Other Python implementations can have other options.
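
The flexibility argument can be sketched with two toy interners that share one interface but store their strings differently. Both classes and the `check` helper are hypothetical illustrations, not CPython code; the point is only that callers who go through `intern()` cannot tell which container is used, whereas callers handed the raw container could.

```python
class DictInterner:
    """Toy interner backed by a dict mapping each string to itself."""

    def __init__(self):
        self._table = {}

    def intern(self, s):
        # setdefault stores s on first sight and returns the stored copy afterwards.
        return self._table.setdefault(s, s)


class ListInterner:
    """Same interface, different storage: a plain list scanned linearly."""

    def __init__(self):
        self._items = []

    def intern(self, s):
        for existing in self._items:
            if existing == s:
                return existing
        self._items.append(s)
        return s


def check(interner):
    # Build two equal strings at runtime and intern both.
    a = "".join(["sp", "am"])
    b = "".join(["sp", "am"])
    return interner.intern(a) is interner.intern(b)


print(check(DictInterner()), check(ListInterner()))
```

If `_table` had been part of the public API, replacing the dict with the list (or with a set, a hash table, or a HAMT) would break every caller that relied on dict methods.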