Mirror of STU-192 on Linear

Question: How do we make it easy and powerful to write flojoy nodes?

Subquestion: What is the purpose of a flojoy node?

Subquestion: What control vs abstraction trade offs do we want to make?

Principles:

- A node should be develop-able with minimal boilerplate
- A node's full execution 'life cycle' should be controllable and accessible
- Publishing node output should be doable via returning from the function (simple lambda nodes)
- Node publishing should also be programmatically controllable (emitter nodes, hardware nodes, etc.)
- Memoization should be manually controllable for nodes
- Node source should be compactly serializable and loadable (for node "app store")
- Existing nodes should be trivially "plug and play" to the new system with either an HOF or a simple refactor

Note regarding pseudocode style: The pseudocode in this document resembles Java or TypeScript. It mixes syntax, picking language features from both as convenient for brevity and specificity; the real Python code will probably look quite different. The use of generics isn't completely valid but should get the point across about which types go where. Feel free to reach out to me (@TheBigSasha) if you have any questions.

Design Proposal

Base node class defining the interface from the execution perspective

abstract class FJBlock<In, Out> {
  /**
    Return the current name. This can be used for status, e.g. download (idle), then download (50%), then download (done).
  */
  public default String name();
  /**
    Return the current description, which can change based on the context of the node state or settings.
  */
  public default String description();
  /**
    **NORTH STAR FEATURE**
    Return the input names and types of the class. By default, use reflection to return the In type, but when In is an arbitrary object, allow fully custom outputs.
    This is useful for highly generic nodes like ROS pub sub. Beats using a JSON input / output.
  */
  public default Pair<String, String>[] getFlowchartInputs();
  /**
    The function called by the flowchart graph when dependent nodes receive all inputs.
  */
  public abstract void on_next(In input);
  /**
    An unsafe internal function to get the subject (for wiring into the flowchart).
  */
  public default Subject ___get_subject___();
  /**
    A safe user-oriented abstraction of the observable to publish data to the flowchart.
  */
  protected default void publish(Out out);
  /**
    A safe user-oriented hook for memoization (think useMemo or useEffect in React).
  */
  protected default T memoize<T>(fn: () => T, deps: Array<Any>);
}
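As a rough illustration only, the interface above might translate to Python along these lines. This is a sketch under assumptions, not the real implementation: a plain list of callbacks stands in for the Rx subject, `subscribe` is a hypothetical stand-in for `___get_subject___`, and `memoize` and `getFlowchartInputs` are left out for brevity.

```python
from abc import ABC, abstractmethod
from typing import Callable, Generic, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


class FJBlock(ABC, Generic[In, Out]):
    """Hypothetical Python shape of the FJBlock pseudocode (not the real API)."""

    def __init__(self) -> None:
        # Assumption: a list of callbacks stands in for the Rx subject.
        self._subscribers: list[Callable[[Out], None]] = []

    def name(self) -> str:
        # Default name; a subclass may override this to report status.
        return type(self).__name__

    def description(self) -> str:
        # Default description taken from the class docstring.
        return (type(self).__doc__ or "").strip()

    @abstractmethod
    def on_next(self, input: In) -> None:
        """Called by the flowchart when all inputs are available."""

    def subscribe(self, callback: Callable[[Out], None]) -> None:
        # Wiring hook for the flowchart (plays the role of ___get_subject___).
        self._subscribers.append(callback)

    def publish(self, out: Out) -> None:
        # User-facing way to push a value downstream.
        for callback in list(self._subscribers):
            callback(out)


class Double(FJBlock[int, int]):
    """Doubles its input."""

    def on_next(self, input: int) -> None:
        self.publish(input * 2)
```

A subclass only has to implement `on_next`; everything else has a default, which is the "minimal boilerplate" principle in practice.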

- Wrapper hides the class and its details for a basic "lambda block"
- Lambda block factory injects hooks and wires up the FJBlock class in the most common way
- Lambda block can also be the place where we use the docstring parser to get name, desc, and flowchart inputs, so that it can be a simple refactor of all existing nodes
- Lambda blocks make up all "pipeline" blocks, visualization blocks, etc.: any block which does not have internal state

FJBlock<In, Out> make_lambda_block(run: (input: In, hooks: Function[]) => Out) {
  // we pass in class method "memoize" as a hook to the lambda because it lets us have some more built in functions (hooks) to manage the block without class access
  return new FJBlock<In, Out>() {
    //...the rest of the implementation of abstract methods
    on_next(input) {
       this.publish(run(input, [this.memoize]));
    }
  }
}
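For comparison, a minimal Python sketch of the same factory idea. The names `_LambdaBlock`, `subscribe`, and the `hooks` namespace are assumptions made for illustration, and the memoize hook here keeps a single cached value per block to stay short.

```python
from types import SimpleNamespace
from typing import Any, Callable, Sequence


class _LambdaBlock:
    """Hypothetical stateless block wrapping a plain function."""

    def __init__(self, run: Callable[[Any, SimpleNamespace], Any]) -> None:
        self._run = run
        self._subscribers: list[Callable[[Any], None]] = []
        self._has_memo = False
        self._memo_deps: list[Any] = []
        self._memo_value: Any = None

    def subscribe(self, callback: Callable[[Any], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, out: Any) -> None:
        for callback in list(self._subscribers):
            callback(out)

    def memoize(self, fn: Callable[[], Any], deps: Sequence[Any]) -> Any:
        # Recompute only on the first call or when the deps change.
        deps = list(deps)
        if not self._has_memo or deps != self._memo_deps:
            self._memo_value = fn()
            self._memo_deps = deps
            self._has_memo = True
        return self._memo_value

    def on_next(self, input: Any) -> None:
        # Hooks are injected so the function never touches the class directly.
        hooks = SimpleNamespace(memoize=self.memoize)
        self.publish(self._run(input, hooks))


def make_lambda_block(run: Callable[[Any, SimpleNamespace], Any]) -> _LambdaBlock:
    return _LambdaBlock(run)
```

Usage would look like `make_lambda_block(lambda x, hooks: hooks.memoize(lambda: expensive(x), [x]))`, where `expensive` is whatever slow function the block wraps.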

Class blocks are used for nodes which manage state, including all hardware nodes, download state, etc.

Downloads can later be abstracted into a make_async_lambda_block, but that's not needed for the initial proposal.


interface NAxisState {
  floatState: float[];
  discreteState: bool[];
  getDefinitions: Pair<int, string>[];
}

class NAxisBlock extends FJBlock<{controllerID: int}, NAxisState> {
  controller = null;

  void on_next(input) {
    memoize(() => {
      if (this.controller != null) {
        this.controller.destruct();
      }
      this.controller = getGameController(input.controllerID);

      this.controller.onInput((state: NAxisState) => {
        this.publish(state);
      });
    }, [input.controllerID]);
  }
}
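A hedged Python sketch of the same stateful pattern follows. `FakeController` is a made-up stand-in for a real driver, and the "reconnect only when the ID changes" check plays the role of the memoize call with `[input.controllerID]` as its dependency list.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class NAxisState:
    float_state: list[float] = field(default_factory=list)
    discrete_state: list[bool] = field(default_factory=list)


class FakeController:
    """Hypothetical stand-in for a game controller driver."""

    def __init__(self, controller_id: int) -> None:
        self.controller_id = controller_id
        self.closed = False
        self._listener: Optional[Callable[[NAxisState], None]] = None

    def on_input(self, listener: Callable[[NAxisState], None]) -> None:
        self._listener = listener

    def emit(self, state: NAxisState) -> None:
        # Simulates the hardware producing a new reading.
        if self._listener is not None and not self.closed:
            self._listener(state)

    def destruct(self) -> None:
        self.closed = True


class NAxisBlock:
    """Class block: keeps a controller open and publishes on hardware input."""

    def __init__(self, open_controller: Callable[[int], FakeController] = FakeController) -> None:
        self._open = open_controller
        self._subscribers: list[Callable[[NAxisState], None]] = []
        self.controller: Optional[FakeController] = None
        self._last_id: Optional[int] = None

    def subscribe(self, callback: Callable[[NAxisState], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, state: NAxisState) -> None:
        for callback in list(self._subscribers):
            callback(state)

    def on_next(self, controller_id: int) -> None:
        # Same ID as last time: keep the existing connection.
        if self.controller is not None and controller_id == self._last_id:
            return
        if self.controller is not None:
            self.controller.destruct()
        self.controller = self._open(controller_id)
        self._last_id = controller_id
        # Publishing is driven by the hardware, not by on_next returning.
        self.controller.on_input(self.publish)
```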

- Legacy blocks should be easy to migrate. If we can't use lambda blocks, can we come up with a class block that fully encompasses the function of retro-style blocks, i.e. one that generates a class block given the function and Python docstring of a classic Flojoy block?

```typescript
FJBlock<In, Out> import_legacy_block(block: Function, docstring: Docstring) {
    // Write a function to turn a legacy block into an FJBlock instance. Use the docstring for the parameters, title, etc., and use on_next for running.
}
```
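One way this could look in Python, as a sketch only: the wrapper reads the name from the function, the description from the first paragraph of the docstring, and the inputs from the signature. `LegacyBlock`, `subscribe`, and `get_flowchart_inputs` are hypothetical names, and a real version would presumably use the existing docstring parser rather than this simple split.

```python
import inspect
from typing import Any, Callable


class LegacyBlock:
    """Hypothetical adapter that exposes a plain function as a block."""

    def __init__(self, func: Callable[..., Any]) -> None:
        self._func = func
        self._subscribers: list[Callable[[Any], None]] = []

    def name(self) -> str:
        return self._func.__name__

    def description(self) -> str:
        # Assumption: the first docstring paragraph is the description.
        doc = inspect.getdoc(self._func) or ""
        return doc.split("\n\n", 1)[0].strip()

    def get_flowchart_inputs(self) -> list[tuple[str, str]]:
        inputs = []
        for param in inspect.signature(self._func).parameters.values():
            if param.annotation is inspect.Parameter.empty:
                type_name = "Any"
            else:
                type_name = getattr(param.annotation, "__name__", str(param.annotation))
            inputs.append((param.name, type_name))
        return inputs

    def subscribe(self, callback: Callable[[Any], None]) -> None:
        self._subscribers.append(callback)

    def on_next(self, *args: Any, **kwargs: Any) -> None:
        # Legacy blocks publish by returning, so forward the return value.
        result = self._func(*args, **kwargs)
        for callback in list(self._subscribers):
            callback(result)


def import_legacy_block(func: Callable[..., Any]) -> LegacyBlock:
    return LegacyBlock(func)
```

The legacy function itself is untouched, which is the point of the "zero-code-change" goal.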

Function Blocks, Hooks

A block can contain hooks for effects/memoization, and internal state which abstract away the RX subjects.

Publish

The publish or "state" hook works just like ReactJS useState — upon calls to publish (setState), changes are propagated. Otherwise, changes to state do not propagate in the reactive compute graph. This can be used for hardware support, like so:

const oddsOnly = (next, hooks) => {
  if (next % 2 != 0) {
    hooks.publish(next);
  }
  return;
}

Instead of returning nulls, this function block filters out evens and propagates only odds by using the publish hook.
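The same filter in Python might look like the sketch below. It assumes the publish hook is passed as a second argument and that a `None` return means "nothing to publish"; both are illustration choices rather than settled API, and `FunctionBlock` is a hypothetical name.

```python
from typing import Any, Callable


class FunctionBlock:
    """Hypothetical function block with an explicit publish hook."""

    def __init__(self, run: Callable[[Any, Callable[[Any], None]], Any]) -> None:
        self._run = run
        self._subscribers: list[Callable[[Any], None]] = []

    def subscribe(self, callback: Callable[[Any], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, value: Any) -> None:
        for callback in list(self._subscribers):
            callback(value)

    def on_next(self, value: Any) -> None:
        result = self._run(value, self.publish)
        # Assumption: returning None propagates nothing downstream.
        if result is not None:
            self.publish(result)


def odds_only(value: int, publish: Callable[[int], None]) -> None:
    # Only odd values are pushed; evens are silently dropped.
    if value % 2 != 0:
        publish(value)


def double(value: int, publish: Callable[[int], None]) -> int:
    # A simple block can ignore the hook and just return.
    return value * 2
```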

Memoize / Effect

The memoize hook is particularly useful for ML or other long-running compute tasks. It takes a lambda and a list of parameter keys; it runs the lambda once, and afterwards only when one of the parameter keys changes. Think useMemo and useEffect in React terms. There is no distinction between the two here, since React distinguishes them based on when they run, primarily for DOM access, which is not an issue for us.
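A small sketch of how such a hook could track several memoized values per block, borrowing React's call-order convention. `MemoHooks` and `begin_run` are assumed names; dependencies are compared with `==`, which would need more care for things like arrays.

```python
from typing import Any, Callable, Sequence


class MemoHooks:
    """Hypothetical memoize hook with one cache slot per call position."""

    def __init__(self) -> None:
        self._slots: list[tuple[list[Any], Any]] = []
        self._cursor = 0

    def begin_run(self) -> None:
        # Called by the block before each run so slots line up by call order.
        self._cursor = 0

    def memoize(self, fn: Callable[[], Any], deps: Sequence[Any]) -> Any:
        deps = list(deps)
        index = self._cursor
        self._cursor += 1
        if index == len(self._slots):
            # First time this call position is reached: always run.
            self._slots.append((deps, fn()))
        elif self._slots[index][0] != deps:
            # A dependency changed: rerun and replace the cached value.
            self._slots[index] = (deps, fn())
        return self._slots[index][1]


hooks = MemoHooks()
log: list[tuple[str, Any]] = []


def load_model(name: str) -> str:
    log.append(("load", name))  # stands in for an expensive model load
    return f"model:{name}"


def scale(threshold: int) -> int:
    log.append(("scale", threshold))
    return threshold * 2


def run(model_name: str, threshold: int) -> tuple[str, int]:
    hooks.begin_run()
    model = hooks.memoize(lambda: load_model(model_name), [model_name])
    cut = hooks.memoize(lambda: scale(threshold), [threshold])
    return model, cut
```

As in React, this only works if memoize calls happen in the same order on every run.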

Suggestion: Add a destruct method to blocks that runs before flowchart teardown, called when a stop signal is received.
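To make the suggestion concrete, here is one possible shape for the teardown side. `Flowchart.stop` is a hypothetical name; tearing down in reverse order and collecting errors instead of raising are assumptions, not decided behavior.

```python
from typing import Any


class Flowchart:
    """Hypothetical runner that tears blocks down on a stop signal."""

    def __init__(self, blocks: list[Any]) -> None:
        self.blocks = list(blocks)

    def stop(self) -> list[tuple[Any, Exception]]:
        errors: list[tuple[Any, Exception]] = []
        # Assumption: tear down in reverse order of construction.
        for block in reversed(self.blocks):
            destruct = getattr(block, "destruct", None)
            if not callable(destruct):
                continue  # blocks without state need no teardown
            try:
                destruct()
            except Exception as exc:
                # One failing block should not stop the others from cleaning up.
                errors.append((block, exc))
        return errors
```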

Thanks Sasha for this writeup; it's a nice roadmap for things to implement in the next few weeks. I just had a few questions while reading.

  /**
    **NORTH STAR FEATURE**
    return the input names and types of the class, by default use reflection to return the In type, but when In is an arbitrary object, allow fully custom outputs.
    This is useful for highly generic nodes like ROS pub sub. Beats using a JSON input / output
  */
  public default Pair<String, String>[] getFlowchartInputs();

I'm not exactly sure what you mean by "fully custom outputs" when In is some arbitrary object. If this is a generic, wouldn't it just be a type variable? Python has support for generics via TypeVar, so we could just use reflection all the time. (This is Python 3.12 syntax, but you can still use generics in 3.11 and earlier.)

def foo[T](a: T, b: T) -> T:
    pass
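To illustrate the reflection idea, a sketch using only the standard library: `typing.get_type_hints` plus `inspect.signature` recover input names and type names from a function. `flowchart_inputs` is a hypothetical helper name, and the example uses the pre-3.12 `TypeVar` spelling so it runs on older interpreters.

```python
import inspect
from typing import Any, Callable, TypeVar, get_type_hints

T = TypeVar("T")


def flowchart_inputs(func: Callable[..., Any]) -> list[tuple[str, str]]:
    """Return (name, type name) pairs for each parameter of func."""
    hints = get_type_hints(func)
    inputs = []
    for name in inspect.signature(func).parameters:
        if name not in hints:
            inputs.append((name, "Any"))  # unannotated parameter
            continue
        annotation = hints[name]
        inputs.append((name, getattr(annotation, "__name__", str(annotation))))
    return inputs


def foo(a: T, b: T) -> T:
    return a


def bar(x: int, y):
    return x
```

For a generic function, reflection only reports the type variable's name (`"T"` here), which is part of why runtime-defined I/O needs something beyond static hints.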

  public default Subject ___get_subject___();
  /**
     a safe user-oriented abstraction of the observable to publish data to the flowchart
  */
  protected default void publish(Out out);

Here, you expose the ReactiveX parts in the class. This class mainly looks like an extension of the FCBlockIO class we created during the initial scaffolding of studiolab. In this case, since you want to be able to control exactly when the publish occurs, wouldn't we have to change out to be a subject instead of an observable to control when to push updates?

With the lambda block factory, it makes sense as an easy way to wrap existing blocks, but one thing that was confusing me is that the pseudocode instantiates an FJBlock, which is abstract:

FJBlock<In, Out> make_lambda_block(run: (input: In, hooks: Function[]) => Out) {
  // we pass in class method "memoize" as a hook to the lambda because it lets us have some more built in functions (hooks) to manage the block without class access
  return new FJBlock<In, Out>() {
    //...the rest of the implementation of abstract methods
    on_next(input) {
       this.publish(run(input, [this.memoize]));
    }
  }
}

I think in the actual implementation we would just have a LambdaBlock class that takes a block function in its constructor and fills in some defaults.

class LambdaBlock(FJBlock):
    def __init__(self, block_func):
        self.func = block_func
        ...

    def on_next(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    # other default implementations

Also, since you're injecting the hooks through the function parameters, this means when migrating all of the existing function blocks to this format, we would have to add an optional hooks parameter for all of them. Personally, I think it would be fine to restrict the use of effects/memoize/state to class-based blocks only. I think the majority of blocks don't need hooks, and those that do can easily be converted into classes inheriting from FJBlock.

  /**
    **NORTH STAR FEATURE**
    return the input names and types of the class, by default use reflection to return the In type, but when In is an arbitrary object, allow fully custom outputs.
    This is useful for highly generic nodes like ROS pub sub. Beats using a JSON input / output
  */
  public default Pair<String, String>[] getFlowchartInputs();
I'm not exactly sure what you mean by "fully custom outputs" when In is some arbitrary object. If this is a generic, wouldn't it just be a type variable? Python has support for generics via TypeVar, so we could just use reflection all the time. (This is Python 3.12 syntax, but you can still use generics in 3.11 and earlier.)

For this, I was thinking about fully runtime defined I/O. For example, a web API node whose output is an object with the fields of what's queried from the API, or a hardware node which outputs differently based on what's connected. I put it as a "north star" feature because it may be difficult to implement or narrowly useful.
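A toy sketch of what runtime-defined I/O could mean in practice: the port list is computed from instance state instead of a static type. `WebApiBlock`, `get_flowchart_outputs`, and the injected `fetch` callable are all hypothetical names for illustration.

```python
from typing import Any, Callable


class WebApiBlock:
    """Hypothetical block whose output fields are chosen at runtime."""

    def __init__(self, fields: list[str], fetch: Callable[[], dict[str, Any]]) -> None:
        self._fields = list(fields)
        self._fetch = fetch  # stands in for the actual API call
        self._subscribers: list[Callable[[dict[str, Any]], None]] = []

    def get_flowchart_outputs(self) -> list[tuple[str, str]]:
        # Ports come from the configured query, not from a type annotation.
        return [(name, "Any") for name in self._fields]

    def subscribe(self, callback: Callable[[dict[str, Any]], None]) -> None:
        self._subscribers.append(callback)

    def on_next(self, _trigger: Any = None) -> None:
        payload = self._fetch()
        # Publish only the queried fields, dropping anything extra.
        out = {name: payload.get(name) for name in self._fields}
        for callback in list(self._subscribers):
            callback(out)
```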

  public default Subject ___get_subject___();
  /**
     a safe user-oriented abstraction of the observable to publish data to the flowchart
  */
  protected default void publish(Out out);
Here, you expose the ReactiveX parts in the class. This class mainly looks like an extension of the FCBlockIO class we created during the initial scaffolding of studiolab. In this case, since you want to be able to control exactly when the publish occurs, wouldn't we have to change out to be a subject instead of an observable to control when to push updates?

Yes, we may have to make changes to out to enable this API. I think the internal reactive structure of the node will need a rework, but that implementation detail may be something we can iron out after establishing the design from a developer interface perspective.

With the lambda block factory, it makes sense as an easy way to wrap existing blocks, but one thing that was confusing me is that the pseudocode instantiates an FJBlock, which is abstract:

This is an anonymous class, some relatively arcane Java syntax. My bad for using something that isn't all too common. It's a shorthand for directly instantiating an abstract class by filling in any abstract methods on instantiation. In effect it is the Java equivalent of a method taking a function as input.

FJBlock<In, Out> make_lambda_block(run: (input: In, hooks: Function[]) => Out) {
  // we pass in class method "memoize" as a hook to the lambda because it lets us have some more built in functions (hooks) to manage the block without class access
  return new FJBlock<In, Out>() {
    //...the rest of the implementation of abstract methods
    on_next(input) {
       this.publish(run(input, [this.memoize]));
    }
  }
}
I think in the actual implementation we would just have a LambdaBlock class that takes a block function in its constructor and fills in some defaults.

If that's a better way to handle this idea in Python, I think that's fair. There isn't a technical reason it should be a factory instead of a constructor.

class LambdaBlock(FJBlock):
    def __init__(self, block_func):
        self.func = block_func
        ...

    def on_next(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    # other default implementations
Also, since you're injecting the hooks through the function parameters, this means when migrating all of the existing function blocks to this format, we would have to add an optional hooks parameter for all of them. Personally, I think it would be fine to restrict the use of effects/memoize/state to class-based blocks only. I think the majority of blocks don't need hooks, and those that do can easily be converted into classes inheriting from FJBlock.

I am imagining that the function blocks and hooks are going to be used primarily for short script or transformation nodes written in an in-studio editor. I wouldn't want someone jumping into Flojoy to have to learn the shape of the FJBlock class unless they need to, which they don't if they're making a simple custom block that does some math or something. Do you think there's a better way to approach that use case?

My goal with the function block was dual:

- Support legacy blocks with a zero-code-change migration
- Create a super low boilerplate API that can be used to create 90% of flojoy blocks (those that don't need to manage custom internal state)

These two goals can also be split off into 2 different implementations if that is deemed to be optimal.

Hi! From the ROS side of things, with this new proposal it will solve:

[BLO-91] ROS 2 [RFC]: Initialization and Shutdown of libraries and framework blocks#30

[BLO-92] ROS 2 [RFC]: Stateful Blocks blocks#29

[BLO-137] ROS 2 [RFC]: Simpler primitive types for DataContainers blocks#28

Could you elaborate on the plan for handling blocks with special dependencies? Would this be the same as it was previously?

@flojoy(deps={"torch": "2.0.1", "torchvision": "0.15.2"})
def DEEPLAB_V3(default: Image) -> Image:
    ...

Would all dependencies be handled with poetry?

I don't see any reason we can't keep using the old dependency management system. I would say it will stay completely the same, except that the decorator might sit above the class instead of the function. @itsjoeoui Thoughts?
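As a sketch of why moving the decorator is cheap: a decorator that only attaches metadata works unchanged on functions and classes. `block_deps` and `__block_deps__` are made-up names here, not the existing `flojoy` decorator, and actual installation of the dependencies is out of scope.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def block_deps(deps: dict[str, str]) -> Callable[[T], T]:
    """Hypothetical decorator recording pinned dependencies on a block."""

    def decorate(target: T) -> T:
        # Attach metadata only; the same code path serves functions and classes.
        target.__block_deps__ = dict(deps)  # type: ignore[attr-defined]
        return target

    return decorate


@block_deps({"torch": "2.0.1", "torchvision": "0.15.2"})
class DeepLabBlock:
    def on_next(self, image):
        return image


@block_deps({"numpy": "1.26.0"})
def legacy_block(default):
    return default
```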

Hi @TheBigSasha, thank you for writing this comprehensive RFC.

It seems the "Publish" subsection is missing a sentence at its end.

IIRC, poetry was used because conda-lock is somewhat slow or because conda-forge does not ship all the packages (for instance, the headless version of opencv, which is needed since OpenCV's license is tied to Qt's).

Let's wait for @itsjoeoui's answer?

Apart from that, import_legacy_block looks a priori like a good approach to port all the nodes' code to the new blocks.

Thanks for the note, I updated the publish section with an example :)

flojoy-ai / studio

[STU 192] - RFC - (Reactive) Blocks Developer Experience #982
