python / cpython

The Python programming language
https://www.python.org
Other
62.39k stars 29.96k forks source link

defaultdict.fromkeys should accept a callable factory #67561

Closed 6c3d51b7-8cfb-4bc0-8909-8739a0c45907 closed 8 years ago

6c3d51b7-8cfb-4bc0-8909-8739a0c45907 commented 9 years ago
BPO 23372
Nosy @rhettinger, @allenap, @serhiy-storchaka, @vedgar

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields: ```python assignee = 'https://github.com/rhettinger' closed_at = created_at = labels = ['type-bug', 'library'] title = 'defaultdict.fromkeys should accept a callable factory' updated_at = user = 'https://bugs.python.org/justanr' ``` bugs.python.org fields: ```python activity = actor = 'rhettinger' assignee = 'rhettinger' closed = True closed_date = closer = 'rhettinger' components = ['Library (Lib)'] creation = creator = 'justanr' dependencies = [] files = [] hgrepos = [] issue_num = 23372 keywords = [] message_count = 9.0 messages = ['235180', '235181', '235187', '235191', '235193', '235196', '276981', '276997', '277000'] nosy_count = 6.0 nosy_names = ['rhettinger', 'allenap', 'SilentGhost', 'serhiy.storchaka', 'veky', 'justanr'] pr_nums = [] priority = 'normal' resolution = 'rejected' stage = None status = 'closed' superseder = None type = 'behavior' url = 'https://bugs.python.org/issue23372' versions = ['Python 3.5'] ```

6c3d51b7-8cfb-4bc0-8909-8739a0c45907 commented 9 years ago

Not something I've noticed until today, but defaultdict.fromkeys only accepts an iterable of keys and an initial value. To set the default_factory for the newly created collection, you need to prod at the attribute itself.

This isn't a huge issue for me, however adding an optional default_factory to fromkeys would be a nice convenience.

8977102d-e623-4805-b309-cd0ad35b0d72 commented 9 years ago

Correct me if I'm wrong but this seem as a very unlikely use case. Why would you need to ensure content of the defaultdict (i.e., why would you ever use its fromkeys method)? And, perhaps, having to implicitly assign default factory is not such a bad tradeoff, considering implementation overhead.

rhettinger commented 9 years ago

-0 I could see some use cases for this, dict.fromkeys(contestants, factory=list), but it adds complexity to a core container API and it starts to overlap the use cases for collections.defaultdict().

Perhaps set comprehensions should remain the one-obvious-way-to-do-it: {k : [] for k in contestants}

8977102d-e623-4805-b309-cd0ad35b0d72 commented 9 years ago

Raymond, but Alec talks about

  defaultdict.fromkeys(constants, factory=list)

as opposed to current solution

  d = defaultdict.fromkeys(constants)
  d.default_factory = list
  for i in d:
      d[i] = []

I wouldn't think that the dict.fromkeys should be affected by this issue.

6c3d51b7-8cfb-4bc0-8909-8739a0c45907 commented 9 years ago

@SilentGhost I was playing with a voting algorithm (Instant Runoff, rather than traditional majority). Ultimately, I ended up using Counter in place of a defaultdict, but that's how I ended up noticing it.

Not so much ensuring the content so much as setting an initial default state to mutate.

@rhettinger I had considered a comprehension and passing it into a defaultdict (or Counter) to get that nice missing key handling.

Implicitly assigning the factory is a pretty good compromise, like I said fromkeys accepting the factory would be a nice convenience rather than correcting a major oversight.

serhiy-storchaka commented 9 years ago

It may be written simpler:

    d = defaultdict(factory)
    for i in constants:
        d[i]

or

    d = defaultdict(factory, {i: factory() for i in constants})

I'm inclined to reject this proposition. It serves very special use case.

608717b0-b7ef-4161-89bc-6f39ad4ee659 commented 8 years ago

It's inconsistent that defaultdict([]) should be rejected:

  >>> defaultdict([])
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: first argument must be callable or None

but defaultdict.fromkeys([]) is okay:

  >>> defaultdict.fromkeys([])
  defaultdict(None, {})

The constructor signature differs between defaultdict and dict, and defaultdict.fromkeys is an alternate constructor, so it seems reasonable to also change its signature.

Also confusing is that I can call fromkeys on an instance of defaultdict:

  >>> dd = defaultdict(list)
  >>> dd.fromkeys([1])
  defaultdict(None, {1: None})

Instinctively I expect the default_factory to be carried over, even though I realise that would be odd behaviour for Python. If defaultdict.fromkeys were to expect a mandatory default_factory argument there wouldn't be this moment of confusion.

fe5a23f9-4d47-49f8-9fb5-d6fbad5d9e38 commented 8 years ago

That's usual behavior for any class method. You can call them on an instance, but they don't have the access to it, only to its class. So transferring of factory would in fact not be possible. Of course,it's possible to make fromkeys an instance method, but please don't do that.

rhettinger commented 8 years ago

[Serhiy] I'm inclined to reject this proposition. It serves very special use case.

[Silent Ghost] Correct me if I'm wrong but this seem as a very unlikely use case

[Alec Nikolas Reiter] Implicitly assigning the factory is a pretty good compromise, like I said fromkeys accepting the factory would be a nice convenience rather than correcting a major oversight.

-----

After thinking about the above, I'm disinclined to make the API modification.

The current way to do it is straight-forward and reads nicely:

>>> d = defaultdict.fromkeys('abcdefg', somedefault)
>>> d.default_factory = somefunc

Combining those into a single step is less readable and may incorrectly imply some sort of interaction between the factory function and initial population of keys with a constant default value. Likewise, the possibility of implicitly guessing and assigning the factory function is contraindicated by the zen-of-python in at least two places.

Given the paucity of use cases, the potential for API confusion, and the ready availability of reasonable alternatives, I'm closing the feature request. Thank for the suggestion though, it certainly seemed like something that deserved a little more thought.