Closed kikito closed 8 years ago
823: lua = 72bc510bad1c67c3185deb3861c9ab1d php = 72bc510bad1c67c3185deb3861c9ab1d 824: lua = 50937e9b88d5c64d3fc519c2717d3cd0 php = a126fd3611ab8d9b7e8a3384e2fa78a0
"md5.lua 1.0.0"
Hi lxss, thanks for reporting. I have tested this with both LuaJIT and Lua. I got different results!
$ luajit test.lua
72bc510bad1c67c3185deb3861c9ab1d
a126fd3611ab8d9b7e8a3384e2fa78a0
$ lua test.lua
72bc510bad1c67c3185deb3861c9ab1d
50937e9b88d5c64d3fc519c2717d3cd0
It seems that there is an error somewhere in the "pure lua" implementation of binary functions. I will have to give this a look, but I don't have time now.
In the meantime, if you can, try using LuaJIT instead of pure Lua.
I wrote lua project is based on the third-party platform, so I can only use lua libraries I went to try to search the bit lua
I failed, I use bit32 lua libraries 824 calculation is correct, the test result error 2048, a lot of 0. AM 3:00 I went to bed. Thank you for your lua I get up in the continue to study
I have made some advancements; the bit libraries used in lua 5.2 and luajit throw different results.
At some point in the calculation 817 << 22
needs to be computed. This throws different results in Lua and LuaJIT:
$ lua -v
Lua 5.2.3 Copyright (C) 1994-2013 Lua.org, PUC-Rio
$ lua -e "print(require('bit32').lshift(817, 22))"
3426746368
$ luajit -v
LuaJIT 2.0.3 -- Copyright (C) 2005-2014 Mike Pall. http://luajit.org/
$ luajit -e "print(require('bit').lshift(817, 22))"
-868220928
It seems that internally Lua 5.2 represents numbers as unsigned ints, while LuaJIT uses signed ints. Hence the different results. I have no idea about how to make them return the same. I will investigate.
Ok, I will also go to try to solve. I use you write sha1.lua temporary solution file checksum.
Lua 5.2 default is 64-bit If it is a 32-bit, the result is correct. No search how to convert 32-bit
Is the problem in bit32
or bit
?
If it is in bit
this fixes it (returns the same value as Lua bit32
):
Replace this:
local ok, bit = pcall(require, 'bit')
if not ok then ok, bit = pcall(require, 'bit32') end
With this:
local ok, bit = pcall(require, 'bit')
if ok then
local oldbit, bit = bit, {}
local number = function(n)
return tonumber('0x'..oldbit.tohex(n),16)
end
bit.bor = function(...) return number(oldbit.bor(...)) end
bit.band = function(...) return number(oldbit.band(...)) end
bit.bnot = function(...) return number(oldbit.bnot(...)) end
bit.bxor = function(...) return number(oldbit.bxor(...)) end
bit.rshift = function(...) return number(oldbit.rshift(...)) end
bit.lshift = function(...) return number(oldbit.lshift(...)) end
else
ok, bit = pcall(require, 'bit32')
end
If the problem is in the other two (bit32
and pure Lua) Then you'll need this function:
normalize = bit.tobit or function (n)
if n > 0x7fffffff then
return - (bit_not(n) + 1)
else
return n
end
You should normalize all the bitops outputs. So in order to use this function you should check that the library loaded is not LuaJIT BitOps but bit32 or pure Lua
OK,I go to try
Hey @lxss,
I have included (a version of) the change proposed by @Positive07 and the 1111... test seems to pass just fine now in 5.1, 5.2 and LuaJIT. Apparently it was only failing on Lua 5.2, because of its non-standard representation of bits using unsigned items.
The fix is on master, and I have also released a new version to luarocks.
Let me know if the new version fixes your problem, so I can close this issue.
I'm glad that hack worked haha!
@Kikito You are using normalize for bit32
but also for bit
(that doesn't need it) which might come as an slow down which may be negligible but should be taken into account.
Doesnt the pure Lua implementation have the same problem as the bit32
?
@Positive07 Nice catch!
The only Lua which needed "fixing" was Lua 5.2; LuaJIT and Lua 5.1 both work with unsigned (if you give a look at the 5.1 code it contains things similar to your fix - it does if n < 0 then <stuff>
.
I have included an extra if now. LuaJIT simply assigns the functions, while Lua 5.2 normalizes them.
Oh I see now! Nice!
Hello, i got a wrong value too. when i do md5 to a large string (string length: 188731). It's all zero value.
Hi @jayceefun, can you paste the value somewhere (like in a gist)? If I can't reproduce it, I can't fix it.
I can paste it here. It's a lua file actually, and i just wanna encrypt the whole file to a md5 value. But i got this value: "00000000000000000000000000000000"
@jayceefun I have removed your file because it was too long (made scrolling through the issue problematic) and it is possible that github issues add/remove spaces and the like (so the md5 will be different). Please upload it to a gist or some other place which does not reformat text and put the link here.
Also, It would help a lot to know which version of Lua you are using and how exactly you are calculating the md5 of the file in question (the fact that you get all 0s suggests that there might be some other problem somewhere).
@kikito The lua version is 5.1 and I'll check my code again. Thanks for your help :)
@kikito local f = io.input(filePath, "r") local fileData = io.read("*all") local md5Str = md5.sumhexa(fileData)
This is my simple code.. and when the file is not too large it shows correct value of MD5.
I've tried, if the string length more than 63543 , return "00000000000000000000000000000000"
@jayceefun This is probably not it, but just in case: try opening the files in binary mode:
local f = io.input(filePath, "rb") -- notice the "rb" instead of just "r"
@kikito it seems that lua stack has a limit size, i'm not sure. Maybe I should cut the file in pieces to calculate, i tried "rb", still got all 0.
@kikito I'm thinking the issue is with:
string
being too longnumber
becoming too big and being turned into math.huge
or Inf
table
getting too many elements and Lua not handling them rightSo in any case an overflow
Hi,
Are you planning to add an update() function in the md5 library instead? That would alleviate the issue with very long strings. We could just do a for loop and pass in the chunks of the string.
Regards.
Hi, can you elaborate? I don't understand what you mean On Nov 19, 2015 2:00 PM, "stejacob" notifications@github.com wrote:
Hi,
Are you planning to add an update() function in the md5 library instead? That would alleviate the issue with very long strings. We could just to a for loop and pass in the chunks of the string.
Regards.
— Reply to this email directly or view it on GitHub https://github.com/kikito/md5.lua/issues/4#issuecomment-158051202.
Basically, when we need to generate an MD5 from a very long string, we would split the long string into smaller chunks and pass each chunk to the MD5 in an iterative approach. Therefore, at each iteration, we would update the Hash value by looking at the current checksum state.
For example (something like that): //========================= For each block of string Do md5.update(block of string) End local md5Hex = md5.generateHex() //=========================
Source: http://stackoverflow.com/questions/2214259/combining-md5-hash-values
Hope that explains better what I mean. By the way, very good job on the MD5 library. Thank you very much.
FYI, I found alternative implementation https://github.com/tst2005/lua-lockbox/blob/master/lockbox/digest/md5.lua
Someone solved this problem? I want know how to solve it.
This patch fixes it for me. Not sure how it will work in a 64 bit machine.
--- md5org.lua 2016-07-01 09:03:17.000000000 +0200
+++ md5.lua 2016-07-01 08:56:52.000000000 +0200
@@ -346,7 +346,7 @@
c=z(i,c,d,a,b,X[ 2],15,t[63])
b=z(i,b,c,d,a,X[ 9],21,t[64])
- return A+a,B+b,C+c,D+d
+ return bit_and(A+a,-1),bit_and(B+b,-1),bit_and(C+c,-1),bit_and(D+d,-1)
end
----------------------------------------------------------------
The problem appears to be that the results of the additions are often out of the 32-bit range. I haven't checked how the out-of-range results propagate to produce erroneous results. It appears that at least two several turns are necessary.
After a bit more research, I've found that the cause is that the values grow and grow until the precision of the mantissa is not enough to hold the complete integer values. Printing a,b,c,d on each iteration sufficed to show this. The code breaks when reaching 253, because past that number, there are integers that are not representable in double precision.
Based on that information, a more robust patch would be:
--- md5org.lua 2016-07-01 09:03:17.000000000 +0200
+++ md5.lua 2016-07-01 17:05:30.000000000 +0200
@@ -346,7 +346,7 @@
c=z(i,c,d,a,b,X[ 2],15,t[63])
b=z(i,b,c,d,a,X[ 9],21,t[64])
- return A+a,B+b,C+c,D+d
+ return (A+a)%0x100000000,(B+b)%0x100000000,(C+c)%0x100000000,(D+d)%0x100000000
end
@pgimeno thanks for taking the time to investigate this. Would you mind sending a pull request? That way you will appear on the contributors list
Also, if you can, please include a test in the PR which fails with the current implementation but works with yours; probably doing something similar to this line.
@lxss has reported here that md5 is returning the wrong md5 value for long enough strings.
@lxss: How are you calculating the "correct" value? What values are you getting with md5.lua?