harvardnlp / namedtensor

Named Tensor implementation for Torch
http://nlp.seas.harvard.edu/NamedTensor
MIT License
443 stars 42 forks source link

Dict-like interface isn't looking good! #102

Open buriy opened 5 years ago

buriy commented 5 years ago

first_batch = x[{"batch": 1}] three_examples = x[{"batch": slice(1, 4)}]

It doesn't look well, no IDE hints, and constructs a new dict every time.

Would you consider pandas-like interface instead? x[x.batch==1] and x[x.batch>=1 && x.batch<4] syntax looks much better and allows for IDE hints. x.batch[1] and x.batch[1:4] also can be used for this specific case -- that would be amazing!

EndingCredits commented 5 years ago

I was just about to make a separate issue for this, but I'll roll it into this one.

I have been thinking about this issue a bit and this is the solution I've come up with:

Method 1: We set an order for slicing, and then use regular slicing. N.B: We set this order as a variable of the named-tensor, we don't do any transposing of the tensor.

e.g. We can set an order by using t['name1','name2,'name3'] and then access using t[a, b, c] which returns the equivalent to t[ { 'name1': a, 'name2': b, 'name3': c }].

This can be used in a number of ways:

Plus we can set once and then treat our tensor as a normal tensor for a while:

t = t['name1','name2','name3']
t1 = t[a, b, c]
t2 = t[d, e, f]

Or maybe something like:

with t.order('name1','name2','name3'):
    t1 = t[a, b, c]
    t2 = t[d, e, f]

Method 2:

We abuse slicing notation and allow: t1 = t['name1': x, 'name2': y:z, 'name3': w] In this case we have three slices where the start member of each slice is set to the name string. This frees up the stop and step parts of the slice for our use, which we can treat as start and stop for our actual slicing. This does remove the capability to use the step component, although that currently isn't implemented anyway, and can be achieved through other means.

Example: assert t['name1': 3, 'name2': 2:5, 'name3': -1] == t[{'name1': 3, 'name2': slice(2,5), 'name3': -1}]

Both these methods can be used together, although there's some question as to certain minor details of how this would work, e.g. what should the behaviour of t['name1', 'name2', 1, 'name3': 3, 2] be.

If you want to have a play around, I have a basic implementation here: https://github.com/EndingCredits/namedtensor

buriy commented 5 years ago

@EndingCredits Using raw names still isn't good. You seemed to ignore all important points in my message. Someone will write t['neme1'] and it won't work, because it's t['name1']. t[t.name1] is much better, and allows operations like t[t.name1>5] or t.name[1:5]. And your t['name1': 3, 'name2': 2:5, 'name3': -1] even isn't a valid python code ( list indices must be integers or slices, not tuple ). There are two good sources to look for similar well-known solutions: Django API and SQLAlchemy API. Pandas used SQLAlchemy API (and did it lousily).

EndingCredits commented 5 years ago

I was't really commenting on your suggestion, just rolling the discussion into one thread.

Also, looks fine to me

buriy commented 5 years ago

@EndingCredits it is much better when linter/compiler warns you about this, instead of at run-time.

srush commented 5 years ago

Thanks for the suggestions, this is helpful and I hear the two different approaches.

I agree the dict construction approach is a bit heavyweight. particularly in loops. I need to look into whether that is actually a problem though from a memory standpoint.

I see why the x.batch approach is appealing in a an editor (if you somehow declare the names). However [x.batch == 1] seems like a strange way to do constant time indexing, unless you see that being lazy somehow.

buriy commented 5 years ago

@srush for that you might have x.batch[1] as well, and anyway it's no worse than x[{"batch": 1}] .

EndingCredits commented 5 years ago

Memory-wise I imagine the overhead of constructing dicts is negligible compared to the actual tensor operations, unless somehow a tensor itself is being copied.

More importantly, the current way of doing thing is just very clunky, and a bit unintuitive. I feel any of the methods above would be a big improvement to usability.