hdima / erlport

ErlPort - connect Erlang to other languages
http://erlport.org
BSD 3-Clause "New" or "Revised" License
629 stars 131 forks source link

Support Erlang's map type (coming in 17.0) #15

Open devinus opened 10 years ago

hdima commented 10 years ago

Thanks, I'll investigate it.

okeuday commented 10 years ago

@hdima I was wondering if you would want to combine efforts for the support of the Erlang binary term format in python and ruby. I have https://github.com/CloudI/CloudI/blob/master/src/api/python/erlang.py and https://github.com/CloudI/CloudI/blob/master/src/api/ruby/erlang.rb which I like as a single module focused on Erlang binary term format encoding and decoding. I need to add MAP_EXT, ATOM_UTF8_EXT, and SMALL_ATOM_UTF8_EXT, but these changes would be better as a common dependency to avoid duplication of effort. I was planning on adding both modules to separate repositories which could help this effort.

okeuday commented 10 years ago

@hdima I have modifications for MAP_EXT, ATOM_UTF8_EXT, and SMALL_ATOM_UTF8_EXT at https://github.com/okeuday/erlang_rb and https://github.com/okeuday/erlang_py . I know tests are lacking, but it would be nice to consolidate efforts.

devinus commented 10 years ago

@okeuday Those packages on RubyGems and PyPI would go a long way. Simple an easy, which is just what I was looking for.

zugolosian commented 9 years ago

+1 to @okeuday term format integration

vans163 commented 8 years ago

+1 for this, really needed.

okeuday commented 8 years ago

@devinus https://rubygems.org/gems/erlang_rb/ and https://pypi.python.org/pypi/erlang_py/ have existed for awhile. I have been using the testing from erlport for the various Erlang binary term format implementations which have grown (https://github.com/okeuday/erlang_rb , https://github.com/okeuday/erlang_py , https://github.com/okeuday/erlang_pl , https://github.com/okeuday/erlang_js , https://github.com/okeuday/erlang_php) but the silence on this issue appears to indicate that erlport doesn't want to change its Erlang binary term format source code right now.

devinus commented 8 years ago

Thanks @okeuday these are great projects.

vans163 commented 8 years ago

@okeuday Yea thanks, your erlang_py worked perfect. No issues.

beano commented 8 years ago

I've looked into implementing this for Python. The encoding and decoding routines were easy to to implement but the hashing restrictions in Python don't really match up with Erlang.

For obvious reasons, the following constructs are legal Erlang but illegal in Python:

#{[] => []}
#{#{} => #{}} 

To implement this in Python I think we will need to return frozen types for lists and maps/dicts. We can accept mutable constructs for the encoding routines as well as their immutable/frozen versions. This will be an API change. @hdima , @okeuday, @devinus - what do you think?

okeuday commented 8 years ago

@beano For lists, it seems best to use an object due to the possibility of the Erlang list being improper (when doing binary_to_term, e.g., https://github.com/okeuday/erlang_py/blob/5369bf7a0c73705a3c33f92ba60b808fe780c58e/erlang.py#L401-L409). That approach doesn't prevent you from handling a list without an object (when doing term_to_binary, e.g., https://github.com/okeuday/erlang_py/blob/5369bf7a0c73705a3c33f92ba60b808fe780c58e/erlang.py#L575-L576). The main problem should be using a frozendict instead of a dict, which is a good solution. I like the implementation at http://code.activestate.com/recipes/414283-frozen-dictionaries/ when using all the comment modifications, and it would be safe to add to the erlang.py implementation since this frozendict inherits from dict. One problem in the erlang.py code is that the binary() function and the hash() function don't have their values cached, so that might be a consideration. So this information isn't focused on erlport, but same concern. (https://github.com/okeuday/erlang_py/blob/5369bf7a0c73705a3c33f92ba60b808fe780c58e/erlang.py#L468-L473 shows the modification)

beano commented 8 years ago

@okeuday - thanks for the hints. I've taken them on board and things are progressing pretty well. I still have tests to add write etc. I was going to try to do ruby support but I think my ruby knowledge isn't going to be strong enough. If my changes are accepted would somebody be willing to do the equivalent in ruby?

okeuday commented 8 years ago

@beano As long as a Hash is being used in the Ruby source code for storing the Erlang map there shouldn't be a problem requiring changes (i.e., doesn't seem like Ruby would have a similar problem with any type usage).

beano commented 8 years ago

@okeuday @hdima - I've implemented this for Python2.7. Doing this correctly is a little involved especially as I've avoided breaking backwards compatibility wherever possible.

The decoding routines will always return immutable types. This prevents users from accidentally modifying the returned structures and breaking hash guarantees. I've not optimized anything at this stage.

Most of the unit tests pass without modification. The single exception was opaque objects - dicts were treated as opaque - we now treat them as maps.

Main change: https://github.com/beano/erlport/commit/3e098436ca7e469a564913edcb31905bab798871

Change for OpaqueObjects: https://github.com/beano/erlport/commit/652f273c2750ccf85566ee17397c3bd44aff4cea

Could you please provide feedback before I do Python3 support?

scohen commented 8 years ago

I'm interested in this as well, how can I help?

beano commented 8 years ago

@scohen - I'm not a rubyist so if you want to do Ruby support that would be great! I looked into it and I think the change is a bit easier than in Python. Ruby's hashes and lists are hashable - but using mutable hash keys has the sorts of odd effects you'd expect. Ruby doesn't appear to protect/prevent users from accidentally hashing mutable types. However, Ruby does support freezing lists and hashes which may be a good idea here. I'm not sure that this is necessary as rubyists may already have an expectation that modifying mutable keys is permissible but dangerous. Again, I'm not a rubyist so please feel to ignore this comment if I am mistaken.

I'll try to get Python3 support committed over the weekend.

beano commented 8 years ago

@hdima, @scohen, @okeuday Python3 support is complete in my fork.

vans163 commented 8 years ago

Going to test this. thanks @beano

Just tested. Can you open up issues on your fork beano? I seem to get an OpaqueObject if I pass a map to python 2.7 with your fork.

#{"key"=>5}
OpaqueObject('t\x00\x00\x00\x01m\x00\x00\x00\x03keya\x05', Atom('erlang'))

#Something like this works:
decode_term(map.data)[0]["key"]

But I thought the point was to have it directly as a dictionary in python without anything extra?

beano commented 8 years ago

@vans163 - thanks for your help. I forgot to commit an erlang change. Could you please test again?

vans163 commented 8 years ago

@beano. Thanks. Working :)

beano commented 8 years ago

@vans163 - Excellent. Thanks!

beano commented 8 years ago

@hdima , @scohen, @okeuday, @vans163 - one interesting possibility that this change opens up is support for keyword arguments.

Something like:

Json = python:call(Pid, 'json', 'dumps', [[1, 2, 3]], #{indent => 4}).

This will reduce the need for erlang shim functions and other workarounds when dealing with python functions that take keyword arguments.

I've done a proof of concept and I think that this can be implemented in a minimally invasive way.

What do people think of this idea?

vans163 commented 8 years ago

@beano All my code is shims which motivate me everyday I see them to rewrite in Erlang.

Reminder

Joking aside (pun intended). How would the python side look like?

python:call(Pid, 'test', 'named_args', [], #{indent => 4, timeout => 20000}).
def named_args(indent=0,page=1,timeout=10000)

Something like that I imagine?

beano commented 8 years ago

@vans163 - exactly. The list would contain positional args and the map would contain keyword args.

vans163 commented 8 years ago

@beano, If all old functionality is preserved and nothing breaks I don't see why not. Maybe even do it so that you can do:

python:call(Pid, 'test', 'named_args', #{indent => 4, timeout => 20000}).

and check for is_map.

jclosure commented 7 years ago

+1 for this feature. I am currently having to use msgpax/protobuf to move maps <=> dicts cleanly.

jdoig commented 6 years ago

Here's a fork with @beano 's work + @weatherforce 's Python3 fixes + a patch to make the maps keys stringy (not sure If this breaks anything else though. This is my first time using erlport and I only need to do a couple of very simple fire-&-forget Elixir -> Python calls) https://github.com/jdoig/erlport.

It allows the following (Elixir) code to work (only an example I'm not using erlport to do jsonification ;) ):

{:ok, py} = :python.start([
 {:python, 'python3'},
 {:python_path, './'},
])

:python.call(py, :json, :dumps, [%{"a" => 3}])

Edit: I've just pushed a change to string-ify the values also