aiq / basexx

A Lua library which provides base2(bitfield), base16(hex), base32(crockford/rfc), base64(rfc/url), base85(z85) decoding and encoding.
MIT License

Ignore newlines #4

Closed daurnimator closed 8 years ago

daurnimator commented 8 years ago

Newlines should be ignored during base64 decoding.

At the moment you get errors like:

lua: /usr/share/lua/5.3/basexx.lua:94: attempt to perform arithmetic on a nil value (local 'index')
stack traceback:
    /usr/share/lua/5.3/basexx.lua:94: in function </usr/share/lua/5.3/basexx.lua:88>
    (...tail calls...)
    main.lua:19: in main chunk
    [C]: in ?

aiq commented 8 years ago

RFC 4648 says the following:

Implementations MUST reject the encoded data if it contains characters outside the base alphabet when interpreting base-encoded data, unless the specification referring to this document explicitly states otherwise. Such specifications may instead state, as MIME does, that characters outside the base encoding alphabet should simply be ignored when interpreting data ("be liberal in what you accept").

I think it is better to handle newline characters separately. basexx.from_base64( rmnewlines( "TW\nFu" ) )

I am also thinking of adding an optional second argument that defines which characters should be ignored: basexx.from_base64( "TW\nFu", "\n" )

By default the function will still reject other characters, but this will make it easy to handle your case. I will think about it today.

What is your opinion?
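
For reference, the pre-filtering approach could be sketched as below. This is only an illustration: rmnewlines is named in the comment above but not defined there, and strip_chars is a hypothetical generalisation that takes the characters to drop as a plain string. Neither is part of basexx.

```lua
-- Hypothetical helper: rmnewlines is mentioned in the thread but not defined there.
local function rmnewlines(str)
   -- The parentheses drop gsub's second return value (the substitution count).
   return (str:gsub("[\r\n]", ""))
end

-- Hypothetical generalisation: remove every character listed in `ignore`.
local function strip_chars(str, ignore)
   if ignore == nil or ignore == "" then
      return str
   end
   -- Escape non-alphanumeric characters so they are literal inside the set.
   local set = ignore:gsub("%W", "%%%0")
   return (str:gsub("[" .. set .. "]", ""))
end
```

The cleaned string can then be handed to the decoder as in the call above, basexx.from_base64( rmnewlines( "TW\nFu" ) ). Note that each helper builds one intermediate string before decoding starts.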

daurnimator commented 8 years ago

> I think it is better to handle newline characters separately. basexx.from_base64( rmnewlines( "TW\nFu" ) )

That's what I'm doing at the moment, but I think creating the extra interim string is a waste.

> I am also thinking of adding an optional second argument that defines which characters should be ignored: basexx.from_base64( "TW\nFu", "\n" )

Sounds reasonable.

aiq commented 8 years ago

Added the functionality to ignore characters.

See https://github.com/aiq/basexx/commit/14faa9d80bd411fbbf1314649ae741df0b0b38b2

daurnimator commented 8 years ago

:( You just used the exact code I was trying to avoid because of the waste. It would be easy for you to build it into the from_basexx function.
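
To illustrate what "building it in" might look like, here is a minimal sketch of a single-pass base64 decoder that skips ignored characters while decoding, so no filtered copy of the input is created. It is not the basexx implementation: the name from_base64_skipping, the ignore parameter as a plain string of characters, and the nil-plus-message error convention are all assumptions made for this example.

```lua
-- Hypothetical single-pass decoder; NOT the actual basexx code.
local ALPHABET =
   "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

-- Map each alphabet byte to its 6-bit value.
local VALUE = {}
for i = 1, #ALPHABET do
   VALUE[ALPHABET:byte(i)] = i - 1
end
local PAD = ("="):byte()

local function from_base64_skipping(str, ignore)
   -- Build a lookup set of bytes to skip (assumed: plain string of characters).
   local skip = {}
   for i = 1, #(ignore or "") do
      skip[ignore:byte(i)] = true
   end

   local out = {}
   local acc, bits = 0, 0        -- bit accumulator and number of pending bits
   for i = 1, #str do
      local b = str:byte(i)
      if b == PAD then
         break                   -- padding marks the end of the data
      elseif not skip[b] then
         local v = VALUE[b]
         if v == nil then
            -- Stay strict by default: anything not ignored must be valid.
            return nil, "unexpected character at position " .. i
         end
         acc = acc * 64 + v
         bits = bits + 6
         if bits >= 8 then
            bits = bits - 8
            local divisor = 2 ^ bits
            local byte = math.floor(acc / divisor)
            out[#out + 1] = string.char(byte)
            acc = acc - byte * divisor   -- keep only the leftover low bits
         end
      end
   end
   return table.concat(out)
end
```

With this shape, from_base64_skipping( "TW\nFu", "\n" ) decodes directly, while from_base64_skipping( "TW\nFu" ) still reports the newline as an error, which matches the strict default discussed above.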

aiq commented 8 years ago

What do you mean by wastage?

The creation of one extra string, or that I call ignore_set four times instead of once in from_basexx?

The extra string is irrelevant in this case because we already create a lot of extra strings. I did not benchmark the functions but I think it is still fast enough.

Calling ignore_set four times is redundant, but I want to keep from_basexx as a strict internal function.

I think we can close this issue because the functionality to ignore newlines is now in the code. I also want to add a new feature to the to functions and then release a new version.

Do you have any further comments?

daurnimator commented 8 years ago

No, consider this closed. I would like to make some performance improvements though.