Open KOLANICH opened 6 years ago
To be frank, I don't understand the majority of this proposal.
KSC generates incorrect code for them, assuming a certain structure of their modules.
With respect to custom processing calls, ksc generates code that assumes you will provide an implementation of that custom processor. If you're re-using an existing implementation of LZ4, it's only natural that ksc would have no idea about the internals of that particular implementation, and you would need to provide a wrapper that translates ksc's encode(...) and decode(...) calls into the relevant calls of your existing implementation.
For the JS target the situation is much worse. KSC fails to compile the ksy for this target.
What do you mean by "much worse"? Any examples?
What do you mean by "much worse"?
It generates no code and throws an error. At least in the WebIDE.
Any examples?
https://github.com/kaitai-io/kaitai_struct_formats/pull/97
With respect to custom processing calls, ksc generates code that assumes you will provide an implementation of that custom processor.
For example, for Python that code imports packages. It imports packages having the same names as stdlib packages have. I guess this should be done another way.
1. the runtime should provide an abstract base class / interface
2. we should create our own class based on it and register it in the runtime
This approach can be used in interpreted languages. The modifications needed to use this approach in languages like C++ and Rust are described in the top post.
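A minimal Python sketch of how such an interface and registration could look; KaitaiProcessor and register_processor are hypothetical names, not part of any existing runtime:

from abc import ABC, abstractmethod

class KaitaiProcessor(ABC):
    # Hypothetical abstract base class the runtime could provide.
    @abstractmethod
    def decode(self, data: bytes) -> bytes:
        # Invert the processing (e.g. decompress).
        ...

    def encode(self, data: bytes) -> bytes:
        # Apply the processing (e.g. compress); optional for read-only use.
        raise NotImplementedError

# Hypothetical registry: the .ksy file refers to processors by these names.
_PROCESSORS = {}

def register_processor(name, cls):
    _PROCESSORS[name] = cls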
kaitai-io/kaitai_struct_formats#97
Ok, so you've introduced several custom processing formats. Then you're supposed to create a class for every custom processing format, each implementing the same kind of interface (i.e. CustomDecoder and/or CustomEncoder). In case of formats like lz4, you won't want to reimplement everything from scratch, so your implementation would be just a wrapper around some existing library (e.g. for Python, for Java, etc).
@KOLANICH I actually went ahead and thought that it would be a good moment to start an "official" collection of processing routines for compression, so here is proof of concept:
https://github.com/kaitai-io/kaitai_compress
This is how you invoke an LZ4 algorithm from this library:
https://github.com/kaitai-io/kaitai_compress/blob/master/_test/ksy/test_lz4.ksy#L6
and here's the actual wrapper "implementation" in Python:
https://github.com/kaitai-io/kaitai_compress/blob/master/python/kaitai/compress/lz4.py
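For reference, such a wrapper can be just a few lines long. A rough sketch of what it boils down to, using the third-party python-lz4 package's lz4.frame module (the actual file linked above may differ in details):

import lz4.frame  # third-party "lz4" package

class Lz4:
    def decode(self, data):
        # KSC hands in the raw bytes it has just read and expects
        # the decompressed bytes back.
        return lz4.frame.decompress(data)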
1. https://github.com/KOLANICH/kaitai_compress/blob/fixes/python/kaitai/compress/__init__.py
a) I guess it should be called "processor", because processing is not limited to compression.
b) 3 methods: one processes, another one inverts the processing if that is possible, and another one gets the arguments from a binary stream. The latter two are needed for serialization. Also it'd be nice to have inverse processing operations.
2. https://github.com/KOLANICH/kaitai_compress/blob/fixes/python/kaitai/compress/lz4.py
As I have said, we shouldn't use that approach. Initialization may be costly, so we create an object first and then reuse it.
Also note that the imports are inside functions, so they are not performed if a function is never called. Unfortunately this gives a small performance overhead: on every import Python checks if the module is already imported.
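A sketch of the stateful variant suggested here, with the import moved into the constructor so the cost is paid once per processor object instead of being re-checked on every call (illustrative only, not the actual kaitai_compress code):

class Lz4:
    def __init__(self):
        # Imported once, when the processor object is created; the lz4
        # package is only required if this processor is actually used.
        import lz4.frame
        self._lz4 = lz4.frame

    def decode(self, data):
        # Reuses the module reference, so Python's "already imported?"
        # check is not repeated on every decode() call.
        return self._lz4.decompress(data)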
@KOLANICH You just keep banging on an open door. These things are already implemented in the publicly released v0.8. If you want to change something, please at least take a look at what's already done.
"KaitaiProcessor" that you propose is actually already implemented as 2 interfaces: "CustomDecoder" (which brings decode
method) and "CustomEncoder" (which brings encode
method). Stateful initialization exists and one can define arbitrary set of arguments as well — however, I'm not sure if it makes sense to do that for LZ4. If we're going to do that, of course, this set of arguments need to be available for implementations in all languages.
already implemented as 2 interfaces: "CustomDecoder" (which brings decode method) and "CustomEncoder" (which brings encode method).
Thank you for the info.
1. I only found CustomDecoder in the Java and C# runtimes. CustomEncoder is not present anywhere in the org.
2. I guess we need to redesign the interface. I'm going to do some prototyping first, but I think we need the following features:
Before I begin, I guess I should point out the fact that we actually have documentation for that.
1. I only found CustomDecoder in Java and C# runtimes.
There's also custom_decoder for C++, but generally you're right, because they actually only make sense for statically typed languages.
CustomEncoder is not present anywhere in the org.
Correct, because we don't have #27 which could have used them :(
2. I guess we need to redesign the interface.
The short answer is "No, we do not, at least not at this point".
The longer answer is that in order to do a better interface, one needs to start with how it's being used and answer the question, "what would be a better interface". Here's how it is used now:
// Simple version: we want just byte array
this._raw_buf = this._io.readBytes(50);
MyCustomProcessor _process__raw_buf = new MyCustomProcessor(key());
this.buf = _process__raw_buf.decode(this._raw_buf);
// More complex version: we want user data type in its own IO stream
this._raw__raw_buf = this._io.readBytes(50);
MyCustom _process__raw__raw_buf = new MyCustom(5);
this._raw_buf = _process__raw__raw_buf.decode(this._raw__raw_buf);
KaitaiStream _io__raw_buf = new ByteBufferKaitaiStream(_raw_buf);
this.buf = new Bar(_io__raw_buf, this, _root);
So, how can we make it better? We just need an interface that takes a bounded IO stream as input and returns another (decrypted, decompressed) stream as output, i.e.:
class MyCustomProcessorStream extends KaitaiStream {
// ...
}
// Gets byte array
BoundKaitaiStream ioSrc = this._io.substream(50);
MyCustomProcessorStream ioDest = new MyCustomProcessorStream(ioSrc, key());
this.buf = ioDest.readBytesFull(); // KaitaiStream method!
// Gets user data type
BoundKaitaiStream ioSrc = this._io.substream(50);
MyCustomProcessorStream ioDest = new MyCustomProcessorStream(ioSrc, key());
this.buf = new Bar(ioDest, this, _root);
This allows on-the-fly decoding, avoids the problem of gulping the whole stream into memory at once, etc. However, this is generally much harder to implement: instead of one decode() method that takes a byte array and returns a byte array, one needs to implement several dozen methods like readU1(), readS1(), etc. On the other hand, the simple "bytes in - bytes out" interface was much easier to implement, and what's more important, if and when we'll introduce a better interface, it would be easy to maintain backwards compatibility by wrapping CustomDecoder / CustomEncoder into a stream as it's done now.
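For illustration, in the Python runtime such a compatibility wrapper could be a few lines around real APIs (KaitaiStream, BytesIO); the helper name decoder_to_stream is hypothetical:

from io import BytesIO
from kaitaistruct import KaitaiStream

def decoder_to_stream(decoder, src_io, length):
    # Read the raw region, run the old-style bytes-in/bytes-out decoder,
    # and expose the result as an ordinary KaitaiStream.
    raw = src_io.read_bytes(length)
    return KaitaiStream(BytesIO(decoder.decode(raw)))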
On the other hand, simple "bytes in - bytes out" interface was much easier to implement, and what's more important, if and when we'll introduce a better interface, it would be easy to maintain backwards compatibility by wrapping CustomDecoder / CustomEncoder into a stream as it's done now.
That's what I suggest to do.
1. KSC decides if the object can/should be static. It can be static if the params are known at compile time.
2. It generates the code: fac = Processor(params)
3. When the data should be decoded, it spawns a context: ctx = fac(data)
4. Using this context, the decoded data can be accessed: ctx[start:stop]
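Roughly, in Python terms, the proposed factory/context split could look like this (all names hypothetical; decoding is shown as a single lazy step for brevity):

class Processor:
    # Hypothetical factory object; construction may be costly and happens once.
    def __init__(self, *params):
        self.params = params

    def __call__(self, data):
        return Context(self, data)       # step 3: ctx = fac(data)

    def _decode(self, data):
        raise NotImplementedError        # supplied by a concrete processor

class Context:
    # Step 4: gives sliced access to the decoded data, ctx[start:stop].
    def __init__(self, processor, data):
        self._processor = processor
        self._data = data
        self._decoded = None             # decoded lazily, then cached

    def __getitem__(self, key):
        if self._decoded is None:
            self._decoded = self._processor._decode(self._data)
        return self._decoded[key]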
One needs to be able to seek through the resulting stream, which would be pretty hard to do for most compression algorithms (i.e. to seek to the N-th uncompressed byte, you would still need to decode everything before the N-th).
That's why we need a context. We can do different things there.
ctx[0:block_size] = H(data), ctx[i*block_size : (i+1)*block_size] = H(ctx[(i-1)*block_size : i*block_size])
is pretty unseekable, but we can cache some previous pairs (i, ctx[i*block_size : (i+1)*block_size]) and assume some locality.
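A rough sketch of that caching idea, with H and block_size as placeholders and an LRU cache standing in for the cached (i, block) pairs:

from functools import lru_cache

class ChainedContext:
    # Block i of the output depends on block i-1, so random access means
    # decoding every preceding block; caching recent blocks exploits locality.
    def __init__(self, data, block_size, H):
        self._data = data
        self._bs = block_size
        self._H = H
        self._block = lru_cache(maxsize=64)(self._compute_block)

    def _compute_block(self, i):
        if i == 0:
            return self._H(self._data)
        return self._H(self._block(i - 1))

    def __getitem__(self, key):
        # key is a slice with explicit start and stop
        first, last = key.start // self._bs, (key.stop - 1) // self._bs
        buf = b"".join(self._block(i) for i in range(first, last + 1))
        base = first * self._bs
        return buf[key.start - base : key.stop - base]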
I have implemented a part of the cso file format.
It uses 2 different compressions that are currently unsupported: deflate and lz4. KSC generates incorrect code for them, assuming a certain structure of their modules. These assumptions are simply wrong. Fortunately, the manual fixes are rather small.
For the JS target the situation is much worse. KSC fails to compile the ksy for this target.
The proposal is to shift the knowledge about processors into the runtime entirely. Each processor is identified by a string. This string is mapped to a class which is a part of the runtime and which takes an arbitrary number of arguments. KSC makes no assumptions about the class's internal structure, but knows about its interface.
This construct works nicely for interpreted languages like JS, Python and Lua, since they can import dependencies at runtime. It can also work for Java and C# via reflection.
All the needed imports can be done within the ctor, so if a processor is never used, its dependency is not required.
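To make the lookup side concrete as well, a hypothetical sketch of what KSC could emit instead of a hard-coded import; get_processor stands for the runtime-side counterpart of the registration sketched earlier in this thread:

# Runtime side (hypothetical): a plain string-to-class mapping.
def get_processor(name, *args):
    return _PROCESSORS[name](*args)

# Generated-code side (hypothetical): no hard-coded import of a concrete module.
#   self._process_buf = get_processor("lz4")   # the ctor performs any imports it needs
#   self.buf = self._process_buf.decode(self._raw_buf)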
It can be extended to C++ too in the following way:
1. no string-to-ctor mapping, everything is done at compile time
2. if a processor function with a given name is used, KSC generates a macro definition with a name derived from the processor function and uses the classes with the generated names:
3. the runtime has the following code for every processor function supported: