Erlang/OTP
http://erlang.org
Apache License 2.0

NEWER_REFERENCE_EXT and 5 ID words in OTP 24 #5097

Closed (halturin closed this 3 years ago)

halturin commented 3 years ago

The documentation at https://erlang.org/doc/apps/erts/erl_dist_protocol.html#DFLAG_V4_NC says:

in the reference case up to 5 32-bit ID words are now accepted in NEWER_REFERENCE_EXT

Here is an example binary with 5 encoded ID words:

<<131,90,0,5,119,19,101,114,108,45,100,101,109,111,49,64,49,50,55,46,48,46,48,46,49,0,0,0,2,0,0,168,104,200,77,0,1,99,248,27,117,0,0,0,0,0,0,0,0>>

But I get an error when I try to decode it:

** exception error: bad argument
     in function  binary_to_term/1
        called as binary_to_term(<<131,90,0,5,119,19,101,114,108,45,100,101,109,111,49,
                                   64,49,50,55,46,48,46,48,46,49,0,0,0,...>>)
        *** argument 1: invalid external representation of a term

With 3 ID words, it decodes fine:

binary_to_term(<<131,90,0,3,119,19,101,114,108,45,100,101,109,111,49,64,49,50,55,46,48,46,48,46,49,0,0,0,2,0,0,168,104,200,77,0,1,99,248,27,117>>).
#Ref<0.1677204341.3360489473.43112>

Is this a bug, or just a misleading description in the documentation?
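For readers following along, here is a minimal Go sketch of the wire layout of the binaries above: the version byte 131, the NEWER_REFERENCE_EXT tag 90, a 16-bit word count, the node name as a SMALL_ATOM_UTF8_EXT (tag 119), a 32-bit creation, and then the ID words. The `Ref` type and `decodeNewerRef` function are hypothetical names used for illustration only. The sketch checks structure only; it does not reproduce the integrity checks that the Erlang runtime applies to local references.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Ref is a hypothetical decoded reference. ID holds exactly as many
// words as were present on the wire (no padding).
type Ref struct {
	Node     string
	Creation uint32
	ID       []uint32
}

// decodeNewerRef parses <<131, 90, Len:16, 119, AtomLen:8, Node, Creation:32, ID...>>.
// Only a node encoded as SMALL_ATOM_UTF8_EXT (tag 119) is handled here.
func decodeNewerRef(b []byte) (Ref, error) {
	if len(b) < 6 || b[0] != 131 || b[1] != 90 {
		return Ref{}, errors.New("not a NEWER_REFERENCE_EXT term")
	}
	n := int(binary.BigEndian.Uint16(b[2:4])) // number of 32-bit ID words
	if b[4] != 119 {
		return Ref{}, errors.New("node is not a SMALL_ATOM_UTF8_EXT")
	}
	atomLen := int(b[5])
	rest := b[6:]
	if len(rest) != atomLen+4+4*n {
		return Ref{}, errors.New("length does not match the declared word count")
	}
	r := Ref{
		Node:     string(rest[:atomLen]),
		Creation: binary.BigEndian.Uint32(rest[atomLen : atomLen+4]),
		ID:       make([]uint32, n),
	}
	words := rest[atomLen+4:]
	for i := range r.ID {
		r.ID[i] = binary.BigEndian.Uint32(words[4*i:])
	}
	return r, nil
}

func main() {
	// The 3-word binary from the report above.
	bin := []byte{131, 90, 0, 3, 119, 19, 101, 114, 108, 45, 100, 101, 109, 111, 49,
		64, 49, 50, 55, 46, 48, 46, 48, 46, 49, 0, 0, 0, 2,
		0, 0, 168, 104, 200, 77, 0, 1, 99, 248, 27, 117}
	r, err := decodeNewerRef(bin)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%+v\n", r)
}
```

The three ID words of that binary are 43112, 3360489473 and 1677204341 in wire order, which are the numbers the shell prints in reverse order in #Ref<0.1677204341.3360489473.43112>.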

garazdawi commented 3 years ago

Rickard will have to correct me if I'm wrong here, but when decoding a local reference (i.e. a reference from a node with the current name and the current creation), some integrity checks are made to make sure that the reference is actually a valid reference created by the node.

I'm not sure if this is documented anywhere. At least I did not find any such documentation.

rickard-green commented 3 years ago

@garazdawi is correct. I'll add a note in the documentation about this. @garazdawi also had an idea about producing extended error info for things like this instead of just throwing a badarg. That will, however, be something for the future, since that is quite a lot of work.

halturin commented 3 years ago

What if the reference came from another node? How should it be handled then?

rickard-green commented 3 years ago

If it originates from another node, it will be accepted.

halturin commented 2 years ago

The Erlang node reports this error when I send it an encoded reference with 5 ID words while DFLAG_V4_NC is enabled on the Erlang node side: [image]

As a workaround, I have disabled it in my project for now: [image]

rickard-green commented 2 years ago

I think you misunderstood what I meant by "originates". I mean the node that the reference identifies itself as coming from, which should be where it originates. You cannot get around that by creating the reference on another node and sending it to the node that it identifies as coming from, i.e. originates from.

Looking at your error message this seems to be what is happening. The term which it is complaining about contains a reference with 5 words originating from a node named "erl-demo@127.0.0.1". The complaint is made on a node with the same name. I cannot verify that the creations are equal, but my guess is that they are. This is not expected to work.

By the way, please paste text instead of pictures of text in issues like these. Pictures make it much harder to investigate, since you cannot easily copy the data and test it.

halturin commented 2 years ago

@rickard-green thanks for the reply. Let me provide more details. There are two nodes: erl-demo@127.0.0.1 (an Erlang node) and demo@127.0.0.1 (a node written in Go using the Ergo Framework).

Here is a simple test:

  1. Start node demo@127.0.0.1.
  2. Start the Erlang node erl-demo@127.0.0.1.
  3. Make a call from erl-demo@127.0.0.1 using gen_server:call({example,'demo@127.0.0.1'}, hello). On demo@127.0.0.1 I receive this message: [example] HandleCall: "hello", From: gen.ServerFrom{Pid:etf.Pid{Node:"erl-demo-22@127.0.0.1", ID:0x7a, Creation:0x2}, Ref:etf.Ref{Node:"erl-demo-22@127.0.0.1", Creation:0x2, ID:[5]uint32{0x3d746, 0x11080003, 0x832e1b68, 0x0, 0x0}}, ReplyByAlias:false}
  4. The gen server example on demo@127.0.0.1 replies with the reference provided by the call request and encodes it as a NEWER_REFERENCE_EXT with an ID length of 5 (according to the documentation).
  5. erl-demo@127.0.0.1 shows me this error:
(erl-demo@127.0.0.1)25> gen_server:call({example,'demo@127.0.0.1'}, hello).
=WARNING REPORT==== 26-Nov-2021::13:29:27.509596 ===
'erl-demo@127.0.0.1' got a corrupted external term from 'demo@127.0.0.1' on distribution channel 9051
<<...,0,0,0,0,0,0,0,0,104,2,108,0,0,0,1,119,5,97,108,105,97,115,90,0,3,119,18,101,114,108,45,100,101,109,111,64,49,50,55,46,48,46,48,46,49,97,159,144,177,0,1,70,217,241,9,0,2,106,23,74,119,0,0,0,0,0,0,0,0,107,0,2,104,105>>
ATOM_CACHE_REF translations: none

My current implementation has an etf.Ref type with a static ID length:

type Ref struct {
   Node     Atom
   Creation uint32
   ID       [5]uint32 // fixed length: shorter IDs are zero-padded to 5 words
}

This means that a reference received from another node with 3 ID words keeps zero values in the last two elements of this array.

I guess the ID length must stay the same as it was when the reference came from erl-demo@127.0.0.1. Am I right?

rickard-green commented 2 years ago

Yes, that is correct.