Closed MCJack123 closed 3 years ago
@MCJack123 Thanks. I have reviewed your code. Thank you for your contribution.
It is okay to introduce dependencies, as long as that is only needed for gzip functionality. Can you implement some of the following checklists when you have time? I currently have a busy work schedule, will help in my early May vocation.
[ ] Pass CI. tests/Test.lua:2954. Add your the name of the new function in the table there. This is to avoid breaking backward compatibility accidentally.
[ ] LibDeflate only throws error when the type of input argument is invalid. No error is thrown if argument type is valid, but data is invalid and cant be decompressed. In this case, two element, (nil, error_code) should be returned. Because I haven't defined an error code in the documentation, you can use any number as the error code in your pull request.
[ ] Make gzip decompressor comply RFC 1952 standard. The integrity of input data to be decompressed must be checked. Additionally, though not required by the standard, I think CRC16(if FLG.FHCRC set), CRC32 and ISIZE should be checked.
RFC 1952 2.3.1.2. Compliance A compliant decompressor must check ID1, ID2, and CM, and provide an error indication if any of these have incorrect values. It must examine FEXTRA/XLEN, FNAME, FCOMMENT and FHCRC at least so it can skip over the optional fields if they are present. It need not examine any other part of the header or trailer; in particular, a decompressor may ignore FTEXT and OS and always produce binary output, and still be compliant. A compliant decompressor must give an error indication if any reserved bit is non-zero, since such a bit could indicate the presence of a new field that would cause subsequent data to be interpreted incorrectly.
[ ] Support gzip compression and comply RFC 1952 standard.
RFC 1952 2.3.1.2. Compliance A compliant compressor must produce files with correct ID1, ID2, CM, CRC32, and ISIZE, but may set all the other fields in the fixed-length part of the header to default values (255 for OS, 0 for all others). The compressor must set all reserved bits to zero.
[ ] Support gzip compression and decompression when calling LibDeflate from the commandline.
[ ] Add tests for your code, and maintain 100.00% code CI tests coverage. Read tests/README.md and .appveyor.yml for tests instructions.
Thanks for your contributions.
I've added a compressor and fixed compliance, but something is causing an error in the tests, so I'm going to need to spend a bit more time debugging that.
As long as most tests passed, it's okay. I'll fix failed tests. You can search "Fail" in the CI log to figure out which test is failing. Some tests I wrote are probably not needed. Some are workarounds in order to get 100% test coverage.
Dont assume "os" and "io" module exists. LibDeflate is originally written for games embeded Lua, which strip away os and io module
Can you give an instruction how to work with ComputerCraft,so I can test it?
What is the size of your big files? LibDeflate is slow when work with large files. It should work for large files though. I did test 50M input data. But anyway, prefer compressor written in C with Lua binding when working on big files.
LibDeflate gzip functionality should be tested against "gzip" See tests/Test.lua:444
Windows CI on Appveyor may not have "gzip" installed. "choco install gzip" may need to be added to .appveyor.yml
I will work on your PR on May 1st, hopefully get a Beta version on May 2nd.
I've added a compressor and fixed compliance, but something is causing an error in the tests, so I'm going to need to spend a bit more time debugging that.
What is the size of your big files? LibDeflate is slow when work with large files. It should work for large files though. I did test 50M input data. But anyway, prefer compressor written in C with Lua binding when working on big files.
I need to be able to compress 700k+ files in ComputerCraft, which is pure Lua. ComputerCraft will halt the Lua computer if it runs for 7 seconds without calling os.pullEvent()
. This is to prevent infinite loops in programs, but can cause problems for programs that require long processing.
Compress file with 700k size, even on compression level 9, shouldn't take 7s. If long processing causes the problem, you can split the file data into multiple parts and call LibDeflate:Compress multiple times and then concat the result
btw, your code is failing the static analyzer check in CI, which is run at the beginning of the CI. Also, I have replied some of your comments above.
I have pushed several changes into the develop branch of this repository.
I made a mistake. Computercraft uses LuaJ as the interpreter. Not Luajit. CI will be added to LuaJ and the ComputerCraft Fork of LuaJ.
I will test more Lua interpreter implementations and try to add workarounds if possible. There is a bug in LuaJ 2.0.3 that modifying table during traversal is not possible. I will add work around for it.
I decide to not merge code that contains ComputerCraft specific APIs because I have never played Minecraft before and I cannot test those features in CI. Also it increases the risk of naming conflict with other lua interpreter because ComputerCraft API does not use special name for its API, and it's possible that ComputerCraft users write those code outside of LibDeflate without modifying LibDeflate.lua.
My current plans is to add tests for another 5 Lua interpreters in CI on the current develop branch, when passed, start to merge your code.
I added gzip -t -v
to the gzip tests to print info about the compressed gzip. Should I keep this for debugging, or remove it?
gzip -t -v
should be okay, unless it makes the test log bigger than the limit of TravisCi or Appveyor.
You can convert math.floor(x)
to (x-x%1)
, then LibDeflate will no longer require a Lua interpreter which implements the "math" module.
Have you pass all tests locally? I suggest to pass tests locally first in LuaJIT.
CI is at least 5 times slower than the current develop branch. I guess it is caused by the CRC32. The current dev branch runs CI in 13min. This PR has exceed 50min limitation of Travis CI.
One solution is to disable gzip tests in original Lua, only enable it in LuaJIT.
I am trying to found if there is any room to optimize CRC32 without installing any "bit" module
Made a faster Pure Lua implementation of CRC32, without using any external library. It is faster than level 1 LibDeflate compression with the same input size.
There is still room to optimize. But now I think it's safe to remove optional "bit" or "bit32" dependencies, because it only gives 2 times speedup, for crc32 only.
It takes 200ms to generate lookup table though, in the classic Lua5.1 interpreter. It uses 64K crc32 lookup table instead of 256. NOTE: In LibDeflate, lookup table should only be generated when LibDeflate:Crc32 is called for the first time. Shouldn't generate it when LibDeflate is imported.
Also, this should solve the ComputerCraft specific problem that compressing file with size of 700k takes more than 7s.
https://gist.github.com/SafeteeWoW/080e784e5ebfda42cad486c58e6d26e4
Also, bit.bxor
return signed int32 value instead of uint32. I think it is the reason why CI is reporting invalid crc32. To convert to uint32, mod the result by 4294967296 (2^32)
This project uses "TAB" as the indention, not "4 spaces". This is to follow Blizzard style, because this project is originally written as the addon to World of Warcraft.
The Lua used in ComputerCraft, LuaJ 2.0.3 is more than10 times slower than classic Lua interpreter. I suggest you to use LibDeflate to work than files less than 200k. Compression for 100k files is already over 10s.
Merging #2 into master will decrease coverage by
1.29%
. The diff coverage is87.09%
.
@@ Coverage Diff @@
## master #2 +/- ##
========================================
- Coverage 100% 98.7% -1.3%
========================================
Files 1 1
Lines 1709 1857 +148
========================================
+ Hits 1709 1833 +124
- Misses 0 24 +24
Impacted Files | Coverage Δ | |
---|---|---|
LibDeflate.lua | 98.7% <87.09%> (-1.3%) |
:arrow_down: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 07f9d36...b053929. Read the comment docs.
If you want to increase code coverage:
Because Lua tests with code coverage tracking enabled is very slow, code coverage is only enabled with LuaJIT. And only subset of tests which can improve code coverage are enabled in CodeCoverage.
See AddToCoverageTest() function in tests/Test.lua
https://gist.github.com/SafeteeWoW/080e784e5ebfda42cad486c58e6d26e4
My version of crc32 updated. Switch back to 256 elements crc32 table to solve long time of caching.
Gonna start to merge your code now. Currently working on "test_more_interpreter" branch of this repo and will merge your code to there.
Your changes have been merge in test_more_interpreter.
However, all codes to work around ComputerCraft / LuaJ 2.0.3 have been removed. You are welcome to maintain a fork of LibDeflate to make it work for ComputerCraft.
This is due to the following reasons:
Making some relative big changes with code style fixes on test_more_interpreter branch
I am considering to add a streaming interface, which is similar to Zlib. This will allow you to compress big files with multiple separate function calls to LibDeflate with an arbitrary size of small chunk of input data, so you can os.pullEvent() or give control to other program between these function calls. Decompression will also provide similar streaming interface.
It basically looks like this:
state = DeflateInit(configs)
Deflate(input_chunk0, state)
-- Do whatever you want between these function calls.
Deflate(input_chunk1, state)
Deflate(input_last_chunk, state)
compressed_data = DeflateEnd(state)
Close for now. May add gzip function later
I added a basic LibDeflate:DecompressGzip function. Tested with gzip 1.6 and Lua 5.1. Uses some bit functions that are included, but this may affect code style.