IBMStreams / streamsx.json

Toolkit for working with JSON in SPL applications.
http://ibmstreams.github.io/streamsx.json/

Anonymous namespace moved out from header #ifdef guard. #99

Closed leongor closed 6 years ago

leongor commented 6 years ago

Generate separate parse/query functions for each operator.

rnostream commented 6 years ago

@leongor Could you please add some information about the reason or requirement for this change? Which problems does it solve, and what advantages does it bring? An issue explaining these points would be best. What tests were performed? In general, PRs should be backed by tests or samples.

leongor commented 6 years ago

@rnostream This PR is a patch for a problem that appeared in Streams Designer. The challenge when using a stateful function is to provide unique state for each combination of operator, thread, and JSON document.

  1. Multiple JSON documents are indexed by templating the function on Index.
  2. Concurrent threads are handled by storing each JSON document in a static thread_specific_ptr variable.
  3. Multiple operators (in a fused PE) are handled by putting the functions in an anonymous namespace, which is unique to each compilation unit (i.e. operator).

In the last case it is important that the compiler creates dedicated functions for each operator, but the #ifdef guard was preventing that. This patch moves the anonymous namespace block out of the #ifdef guard.
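The three mechanisms above can be sketched in a few lines of C++. This is a minimal illustration, not the toolkit's code: `Index1`, `Index2`, `getDocument`, `parse`, and `query` are hypothetical names, `std::string` stands in for a parsed JSON document, and C++11 `thread_local` stands in for the thread_specific_ptr mentioned above.

```cpp
#include <string>

// Hypothetical stand-ins for the SPL-generated index types:
// each index is its own C++ type.
struct Index1 {};
struct Index2 {};

// Anonymous namespace: every compilation unit that contains this code
// gets its own copy of these functions and of their static state.
namespace {

// One instantiation per Index type, and within each instantiation
// one object per thread (assumption: thread_local replaces thread_specific_ptr).
template <typename Index>
std::string& getDocument() {
    thread_local std::string doc;
    return doc;
}

// Stores the "document" in the slot selected by the index type.
template <typename Index>
void parse(const std::string& json, Index) {
    getDocument<Index>() = json;
}

// Reads back the "document" from the slot selected by the index type.
template <typename Index>
const std::string& query(Index) {
    return getDocument<Index>();
}

} // namespace
```

Calling `parse(a, Index1{})` and `parse(b, Index2{})` writes to two independent slots, so `query(Index1{})` still returns `a` afterwards. No lookup or locking happens at run time; the slot is selected by overload resolution during compilation.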

rnostream commented 6 years ago

@leongor Sorry for the question; I don't have much experience with templates. GetDocument has a typename template parameter. Isn't only one function instantiated by the compiler, namely the one for the Index type, since GetDocument is always parameterized with Index as the type? There is no specialization or non-type parameter instructing the compiler to generate more than this one instance, so wouldn't there be only one document available?

leongor commented 6 years ago

@rnostream Once a function is templated, the compiler instantiates one function per distinct type it is used with. Each SPL enum in the JsonIndex composite (_1, _2, and so on) creates a separate function instance when used as a parameter. So if an SPL application contains two function calls, parseJSON(jsonString, JsonIndex._1) and parseJSON(jsonString, JsonIndex._2), the compiler will instantiate two different functions.

rnostream commented 6 years ago

@leongor Your sample function calls show runtime behavior. Templates need compile-time constants, and there are no function declarations taking single enum values. In my opinion, only one GetDocument() instance is generated.

leongor commented 6 years ago

@rnostream JsonIndex._1 and JsonIndex._2 are enum values of two different types, JsonIndex.type_1 and JsonIndex.type_2 respectively, not two values of the same enum! (As a reminder, SPL enums are generated as C++ classes, not C++ enums.) That is why the compiler generates two different functions: one instantiated to accept a value of enum type JsonIndex.type_1 and another to accept a value of enum type JsonIndex.type_2. This happens at compile time.
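The distinction between "two values of one enum" and "two separate types" can be checked with a small C++ sketch. The names here (`Plain`, `Type1`, `Type2`, `callCount`, `touch`) are hypothetical and only mimic the situation described above; they are not taken from the toolkit or from SPL-generated code.

```cpp
#include <cstddef>

// A plain C++ enum: both values share the single type Plain.
enum class Plain { One, Two };

// Hypothetical stand-ins for SPL-generated enum classes: one type per index.
struct Type1 {};
struct Type2 {};

// Each instantiation of this template owns its own static counter.
template <typename T>
std::size_t& callCount() {
    static std::size_t count = 0;
    return count;
}

// T is deduced from the argument's type, not from its value.
// Returns how often this particular instantiation has been called.
template <typename T>
std::size_t touch(T) {
    return ++callCount<T>();
}
```

With these definitions, `touch(Plain::One)` followed by `touch(Plain::Two)` returns 1 and then 2, because both calls share the instantiation for `Plain`. In contrast, `touch(Type1{})` and `touch(Type2{})` each return 1, because the compiler generates one instantiation, with its own static state, per type.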

rnostream commented 6 years ago

OK, you got me. I wasn't aware at that moment that you had defined the index values this way, each being its own type. And yes, with this, each use in SPL with a different index (that is, a different type) instructs the compiler to generate a template instance for that type, resulting in an index-specific GetDocument().

rnostream commented 6 years ago

But it is good to have discussed this, because it is worth documenting for those trying to understand it ... the key was types.spl.

schubon commented 6 years ago

I wish that yesterday, when I told you that some languages compile enum values into separate classes, it had also occurred to me that it was actually SPL where I had seen this ...

Sorry for that ...


rnostream commented 6 years ago

@leongor One more question ;-) ... what was the reason for choosing this somewhat complicated template construct to implement index-based storage of and access to a set of documents, rather than a collection-based approach? Just out of interest.

leongor commented 6 years ago

@rnostream The goal was to combine the power of an operator, which can store state as a private member, with the flexibility of a function, which can be called in any SPL context multiple times or even in a loop, at maximum performance. A map of JSON documents or any other standard collection could be used, but those are not thread-safe, so additional logic would be needed to handle multiple threads. With the templated-function approach, all of the management happens at compile time, so at run time there are no locks and no container memory management. I consider Streams a near-real-time environment, so any performance boost is not just welcome but a must, from my point of view.
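For contrast, the collection-based alternative raised in the question might look roughly like the sketch below. It is an illustration only: `SharedDocuments` is a hypothetical class, `std::string` stands in for a parsed JSON document, and a single `std::mutex` is just one of several ways to add the "additional logic" for multiple threads.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical collection-based store: documents are selected by a
// runtime index, and every access goes through a lock.
class SharedDocuments {
public:
    // Stores (or overwrites) the document for the given index.
    void parse(int index, const std::string& json) {
        std::lock_guard<std::mutex> lock(mutex_);
        docs_[index] = json;
    }

    // Returns a copy of the stored document, or an empty string if none exists.
    std::string query(int index) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = docs_.find(index);
        return it == docs_.end() ? std::string() : it->second;
    }

private:
    mutable std::mutex mutex_;          // serializes access across threads
    std::map<int, std::string> docs_;   // runtime-indexed documents
};
```

Every call here pays for a lock acquisition and a map lookup, and insertion may allocate a map node. Those are the run-time costs the templated approach avoids by resolving the slot at compile time.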