lbryio / lbrycrd

The blockchain that provides the digital content namespace for the LBRY protocol
https://lbry.com
MIT License
2.57k stars 178 forks source link

Claim name returned is strange #119

Closed kaykurokawa closed 5 years ago

kaykurokawa commented 6 years ago

Via Lex:

working on unit tests for the claim name script and using real live blockchain data and am confused about lbrycrd output:

    ```"asm": "OP_CLAIM_NAME 1937006947 080110011a7808011230080410011a084d616361726f6e6922002a003214416c6c207269676874732072657365727665642e38004a0052005a001a42080110011a30add80aaf02559ba09853636a0658c42b727cb5bb4ba8acedb4b7fe656065a47a31878dbf9912135ddb9e13806cc1479d220a696d6167652f6a7065672a5c080110031a404180cc0fa4d3839ee29cca866baed25fafb43fca1eb3b608ee889d351d3573d042c7b83e2e643db0d8e062a04e6e9ae6b90540a2f95fe28638d0f18af4361a1c2214f73de93f4299fb32c32f949e02198a8e91101abd OP_2DROP OP_DROP OP_DUP OP_HASH160 be16e4b0f9bd8f6d47d02b3a887049c36d3b84cb OP_EQUALVERIFY OP_CHECKSIG",
    "hex": "b504636174734cdc080110011a7808011230080410011a084d616361726f6e6922002a003214416c6c207269676874732072657365727665642e38004a0052005a001a42080110011a30add80aaf02559ba09853636a0658c42b727cb5bb4ba8acedb4b7fe656065a47a31878dbf9912135ddb9e13806cc1479d220a696d6167652f6a7065672a5c080110031a404180cc0fa4d3839ee29cca866baed25fafb43fca1eb3b608ee889d351d3573d042c7b83e2e643db0d8e062a04e6e9ae6b90540a2f95fe28638d0f18af4361a1c2214f73de93f4299fb32c32f949e02198a8e91101abd6d7576a914be16e4b0f9bd8f6d47d02b3a887049c36d3b84cb88ac",```

looking at the hex: b5 is correct for OP_CLAIM_NAME, then it's 04 which is correct for length of upcoming string, consuming the next 4 bytes (8 in hex) gives us cats:


'cats'```

so... wtf is `1937006947` as returned by lbrycrd as the claim name?
it's not hex and it's not ascii

### Acceptance Criteria
1.
2.
3.

### Definition of Done
- [ ]  Tested against acceptance criteria
- [ ] Tested against the assumptions of the user story
- [ ] The project builds without errors
- [ ] Unit tests are written and passing
- [ ] Tests on devices/browsers listed in the issue have passed
- [ ] QA performed & issues resolved
- [ ] Refactoring completed
- [ ] Any configuration or build changes documented
- [ ] Documentation updated
- [ ] Peer Code Review performed
tiger5226 commented 6 years ago

Agreed with the following:

b5 is OP_CLAIM_NAME 04 is 4 bytes or 63617473 63617473 is "cats" 4c is OP_PUSHDATA1 dc is 220 bytes or the claim

I don't process the ASM, but the raw data from lbrycrd for the chainquery app because there is different behavior for ASMs that make it unreliable as a source in this manner. Below is a screenshot of all claims by the name "cats". All the ASMs have this problem.

screen shot 2018-04-25 at 7 58 56 pm

Taking a look at the code an interesting situation appears.

https://github.com/lbryio/lbrycrd/blame/270307a29070a4c10095528e232ad0f8e251ec5a/src/core_write.cpp#L90

if (vch.size() <= static_cast<vector<unsigned char>::size_type>(4)) {
                str += strprintf("%d", CScriptNum(vch, false).getint());
            } 

So if the group of bytes is 4 or less then lbrycrd treats it as a littleendian uint16 for the ASM which translates to...drum roll...1937006947. Other wise it takes the hexstr which is what everyone expects to be the output.

To top it off this is also expected in the bitcoin codebase as well below: https://github.com/bitcoin/bitcoin/blob/master/src/core_write.cpp#L95

I also created an issue for the bitcoin repository to see what the functional use case of this is, in case anyone is ever interested in what the developers have to say: https://github.com/bitcoin/bitcoin/issues/13085

kaykurokawa commented 6 years ago

We can delete this if statement: https://github.com/lbryio/lbrycrd/blame/270307a29070a4c10095528e232ad0f8e251ec5a/src/core_write.cpp#L90

Since for our use case it is better to treat data of any size as a hex string

mirgee commented 6 years ago

Are any new unit tests necessary for this, except those already in existence: https://github.com/lbryio/lbrycrd/blob/master/src/test/script_tests.cpp#L1067 ?

lbrynaut commented 6 years ago

We can delete this if statement: https://github.com/lbryio/lbrycrd/blame/270307a29070a4c10095528e232ad0f8e251ec5a/src/core_write.cpp#L90

@kaykurokawa I'm not saying I strongly disagree here, since it's just a string representation, but I also think it can be easily (and consistently) handled upstream since it's the same data.

#include <stdio.h>
#include <arpa/inet.h>

int main(int argc, char** argv)
{
    printf("%d, %x, %x\n", 1937006947, 1937006947, ntohl(1937006947));
    return 0;
}

This prints out 1937006947, 73746163, 63617473, which is what you'd expect.

alyssaoc commented 6 years ago

@kaykurokawa What is the status of this issue? I

kaykurokawa commented 6 years ago

@mirgee provided https://github.com/lbryio/lbrycrd/pull/172 for a fix if someone wants to use it. But to err on the safe side, it shall not be merged in case some applications might expect the decode function to behave as it does in Bitcoin.

A proper solution should detect to see if data to print is part of a claim transaction, and always treat that data as hex if that is the case.

No one is depending on this feature so this is not urgent.

BrannonKing commented 5 years ago

fixed via cherry-pick