Closed (FCO closed this issue 5 years ago)
Make it possible to run different code in parallel on different (or the same) databases
% perl6 -I. -e '
use Red <red-do>;
model Bla { has UInt $.id is serial; has Str $.name is column }
red-defaults "SQLite", :database<test.db>;
say red-do { Bla.^load: 1 }, { Bla.^load: 2 }, { Bla.^load: 3 }
'
(Bla.new(id => 1, name => "bla") Bla.new(id => 2, name => "ble") Bla.new(id => 3, name => "bli"))
% perl6 -I. -e '
use Red <red-do>;
model Bla { has UInt $.id is serial; has Str $.name is column }
red-defaults "SQLite", :database<test.db>;
say await red-do { start Bla.^load: 1 }, { start Bla.^load: 2 }
'
(Bla.new(id => 1, name => "bla") Bla.new(id => 2, name => "ble"))
% perl6 -I. -e '
use Red <red-do>;
model Bla { has UInt $.id is serial; has Str $.name is column }
red-defaults "SQLite", :database<test.db>;
say await red-do { start Bla.^load: 1 }, { start Bla.^load: 2 }, { start Bla.^load: 3 }
'
An operation first awaited:
in block <unit> at -e line 6
Died with the exception:
Unknown Error!!!
Please, copy this backtrace and open an issue on https://github.com/FCO/Red/issues/new
Driver: Red::Driver::SQLite
Original error: X::DBDish::DBError.new(driver-name => "DBDish::SQLite", native-message => "not an error", code => 1, why => "Error")
Original error:
DBDish::SQLite: Error: not an error (1)
in method handle-error at /Users/fernando/.rakudobrew/versions/moar-2019.03.1/install/share/perl6/site/sources/9FB62DC76EFA166DFBA147ED75C743F9BE8BA042 (DBDish::SQLite::Connection) line 17
in method prepare at /Users/fernando/.rakudobrew/versions/moar-2019.03.1/install/share/perl6/site/sources/9FB62DC76EFA166DFBA147ED75C743F9BE8BA042 (DBDish::SQLite::Connection) line 26
in method prepare at /Users/fernando/Red/lib/Red/Driver/SQLite.pm6 (Red::Driver::SQLite) line 47
in code at /Users/fernando/Red/lib/Red/Driver.pm6 (Red::Driver) line 22
in code at /Users/fernando/Red/lib/Red/Driver.pm6 (Red::Driver) line 21
in method prepare at /Users/fernando/Red/lib/Red/Driver.pm6 (Red::Driver) line 18
in submethod TWEAK at /Users/fernando/Red/lib/Red/ResultSeq/Iterator.pm6 (Red::ResultSeq::Iterator) line 14
in method iterator at /Users/fernando/Red/lib/Red/ResultSeq.pm6 (Red::ResultSeq) line 72
in method Seq at /Users/fernando/Red/lib/Red/ResultSeq.pm6 (Red::ResultSeq) line 78
in method do-it at /Users/fernando/Red/lib/Red/ResultSeq.pm6 (Red::ResultSeq) line 83
in method head at /Users/fernando/Red/lib/Red/ResultSeq.pm6 (Red::ResultSeq) line 316
in method load at /Users/fernando/Red/lib/MetamodelX/Red/Model.pm6 (MetamodelX::Red::Model) line 381
in code at -e line 6
Actually thrown at:
in block at /Users/fernando/Red/lib/Red/Driver/SQLite.pm6 (Red::Driver::SQLite) line 43
in any at /Users/fernando/Red/lib/Red/Driver/SQLite.pm6 (Red::Driver::SQLite) line 41
in method prepare at /Users/fernando/.rakudobrew/versions/moar-2019.03.1/install/share/perl6/site/sources/9FB62DC76EFA166DFBA147ED75C743F9BE8BA042 (DBDish::SQLite::Connection) line 39
in method prepare at /Users/fernando/Red/lib/Red/Driver/SQLite.pm6 (Red::Driver::SQLite) line 47
in code at /Users/fernando/Red/lib/Red/Driver.pm6 (Red::Driver) line 22
in code at /Users/fernando/Red/lib/Red/Driver.pm6 (Red::Driver) line 21
in method prepare at /Users/fernando/Red/lib/Red/Driver.pm6 (Red::Driver) line 18
in submethod TWEAK at /Users/fernando/Red/lib/Red/ResultSeq/Iterator.pm6 (Red::ResultSeq::Iterator) line 14
in method iterator at /Users/fernando/Red/lib/Red/ResultSeq.pm6 (Red::ResultSeq) line 72
in method Seq at /Users/fernando/Red/lib/Red/ResultSeq.pm6 (Red::ResultSeq) line 78
in method do-it at /Users/fernando/Red/lib/Red/ResultSeq.pm6 (Red::ResultSeq) line 83
in method head at /Users/fernando/Red/lib/Red/ResultSeq.pm6 (Red::ResultSeq) line 316
in method load at /Users/fernando/Red/lib/MetamodelX/Red/Model.pm6 (MetamodelX::Red::Model) line 381
in code at -e line 6
But it works with Pg:
% perl6 -I. -e '
use Red <red-do>;
model Bla is table<not_bla> { has UInt $.id is serial; has Str $.name is column }
red-defaults "Pg";
Bla.^create-table: :if-not-exists;
say await red-do { start Bla.^load: 1 }, { start Bla.^load: 2 }, { start Bla.^load: 3 }, { start Bla.^load: 4 }, { start Bla.^load: 5 }
'
(Bla.new(id => 1, name => "bla") Bla.new(id => 2, name => "ble") Bla.new(id => 3, name => "bli") Bla.new(id => 4, name => "blo") Bla.new(id => 5, name => "blu"))
I'm thinking of accepting both Positionals and non-Positionals: when given a Positional, it will respect the order; otherwise, it will try to run everything in parallel...
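A rough sketch of that dispatch idea (run-blocks is a made-up name used only for illustration; this is not Red's API):
# Illustrative only: a single Positional argument keeps the given order,
# while separate block arguments are started in parallel.
multi sub run-blocks(@blocks) {
    @blocks.map({ .() }).eager
}
multi sub run-blocks(+@blocks) {
    await @blocks.map({ start .() })
}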
% perl6 -I. -e '
use Red <red-do>;
model Bla is table<not_bla> { has UInt $.id is serial; has Str $.name is column }
red-defaults pg => \("Pg", :default), sqlite => \("SQLite");
red-do <pg sqlite> => { Bla.^create-table: :if-not-exists };
say red-do (<pg sqlite> => { Bla.^load: 6 }), "sqlite" => { Bla.^load: 7 }, (pg => { Bla.^load: 8 })
'
(Bla.new(id => 6, name => "b") Nil Nil Bla.new(id => 8, name => "c"))
This question would also involve red-do syntax. This is, perhaps, a good time to specify what is expected from it. To my view:
red-do { ... }
red-do 'db1' => { ... }, 'db2' => { ... };
For readability purposes red-do might accept a :with named parameter:
red-do :with<db1>, { ... }, { ... };
:async or :parallel named parameters would result in code blocks being executed asynchronously:
red-do :async, :with<db1>, { ... }, { ... }, ...;
red-do :parallel, 'db1' => { ... }, 'db2' => { ... }, ...;
A convenience alias for the above, red-do-async, might be considered too.
The internal implementation of the asynchronicity is not relevant. Perhaps race would work well enough, or each block could simply be run within a dedicated start; any option would be good. But what is important is that I expect parallelizing to be heavily used in cases where data coming from one (or a few) databases should be processed and re-dispatched into other database(s). A user can always create their own means of inter-block communication, but Red could provide the most basic approach out of the box for maximum simplicity. I haven't thought out the actual syntax and semantics well enough, but here is what I may propose.
Routines red-emit and red-tap are to be provided with the following signatures:
sub red-emit(Str:D $name, *%named --> Supplier) { ... }
sub red-tap(Str:D $name, &block, *%named --> Supply) { ... }
Calling either of them would result in creating a new Supplier under the specified $name unless one already exists. Then this supplier could be used to transfer data between asynchronous blocks. All created suppliers are valid within the execution time of their enclosing red-do and will cease to exist when the routine finishes.
For example:
red-do-async
    'db1' => { red-emit "mytest", fetch-data },
    'db2' => { red-tap "mytest", { process-data $_ } },
    'db3' => { red-tap "mytest", { store-duplicate $_ } },
;
For convenience and syntax sugar, red-emit "mytest" => { ... } could also be supported.
Similarly, routines red-send and red-receive could be provided to support queuing via Channel.
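A hedged sketch of what such Channel-backed counterparts could look like (red-send, red-receive, and the %channels registry are illustrative assumptions, not an existing Red API):
# Sketch: named queues backed by Channels, mirroring red-emit/red-tap
my %channels;

sub red-send(Str:D $name, $value --> Channel) {
    my $c = %channels{$name} //= Channel.new;   # create the named Channel on first use
    $c.send: $value;
    $c
}

sub red-receive(Str:D $name, &block --> Promise) {
    my $c = %channels{$name} //= Channel.new;
    start {
        react {
            whenever $c { block $_ }            # consume queued values asynchronously
        }
    }
}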
A side note: red-do can also support more advanced syntax for implicit asynchronous operation via the following signature:
sub red-do (*%named) { ... }
Then if used like:
red-do db1 => { ... }, db2 => { ... };
It would result in running the blocks in parallel. The point is in not using quotes around database names, turning them into named parameters instead of a list of Pairs. In this case manual handling of the named parameters would be required, but generally this could be implemented as simply as:
# Assuming red-do is a multi: this candidate sorts db => block pairs from
# other named arguments and redispatches to the positional-pairs candidate.
multi sub red-do (*%named) {
    my @pos;
    my %n;
    for %named.pairs {
        if .value ~~ Code {
            @pos.push: $_
        }
        else {
            %n.append: $_
        }
    }
    samewith(|@pos, |%n, :async)
}
I seem to be out of sync here, but that's because I started writing this comment about 7 hours ago and finished it just now. :)
Should these supplies/suppliers be shared between red-do's, or should they exist only in the red-do where they were created? Should red-do-async block, or should it return a promise? Why not use a react block? Thank you
Suppliers and channels are only for the red-do which creates them. Otherwise it'd be prone to memory-leak problems. One would have to create their own, controllable, means of communication for cross-red-do things.
I think red-do must block by default on both sequential and async operations. This is the part of the behavior which must not change. But as to red-do-async – it's a good question! Perhaps it's a good idea to have it return a Promise and send its blocks into threads.
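One minimal interpretation of that (just a sketch; a real implementation might start each block in its own thread instead):
# Sketch: red-do-async as a thin non-blocking wrapper around red-do
sub red-do-async(|c --> Promise) {
    start red-do |c
}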
@vrurg would you mind writing an example of use for that supply/channel? I couldn't "see" it yet...
I mean, I keep something like this in my head:
supply {
    red-do <db1 db2> => { whenever start { Model.^all.grep: {...} } -> $result { .emit for $result } }
}
No, that's too much! Say, we need to duplicate records from one db into many:
red-do :parallel,
    db-source => {
        red-emit "dup", $_ for Model.^all
    },
    db-dest1 => {
        red-tap "dup", -> $record {
            Model.^create: $record
        }
    },
    db-dest2 => {
        red-tap "dup", -> $record {
            Model.^create: $record
        }
    }
And that's all. For this to work you would need to create a Supplier for dup and then re-use it.
BTW, note that tap is an async co-routine. So, basically, the above example doesn't even need :parallel unless receivers plan to do some additional work outside of the tap.
red-do :parallel,
    db-source => {
        red-emit "dup", $_ for Model.^all
    },
    <db-dest1 db-dest2> => {
        red-tap "dup", -> $record {
            Model.^create: $record
        }
    }
Maybe this way we avoid duplicating code...
I don't think this would often be needed. I've done it this way just to make the example clean. In real life it would be something like:
sub store-record ($record) { ... }

red-do :parallel,
    ...
    db-dest1 => &store-record,
    db-dest2 => &store-record,
or, if there happen to be many destinations:
my %p = @all-dests.map: { $_ => &store-record };
red-do :parallel, |%p
...
I see no need to overcomplicate in this case.
@vrurg now this is possible (the pg was pre-populated):
% perl6 -I. -e '
use Red <red-do>;
model Bla is table<not_bla> { has UInt $.id is serial; has Str $.name is column }
red-defaults pg => \("Pg", :default), sqlite => (my $sqlite = database("SQLite"));
red-do <pg sqlite> => { Bla.^create-table: :if-not-exists };
red-do :async,
    {
        red-emit "dup", $_ for Bla.^all
    },
    :sqlite{
        red-tap "dup", -> $record {
            $record.^save: :insert
        }
    },
;
red-do
    $sqlite => { say "sqlite => ", $_ for Bla.^all.batch(3).head },
    "pg" => { say "pg => ", $_ for Bla.^all.batch(3).head },
'
sqlite => Bla.new(id => 1, name => "bla")
sqlite => Bla.new(id => 2, name => "ble")
sqlite => Bla.new(id => 3, name => "bli")
pg => Bla.new(id => 1, name => "bla")
pg => Bla.new(id => 2, name => "ble")
pg => Bla.new(id => 3, name => "bli")
Maybe we should have a supply on the Driver that would emit everything that happens on that driver... and the user could grep for what they want.
red-do {
    .events.grep({ .event-type eq "create" and .model === MyModel }).tap: { $logger.debug: "Created: { .obj }" }
}
and it would create a log entry ($logger is just an example) for every newly created MyModel object.
Maybe the event class could be something like:
enum EventType <create delete update>;

class Event {
    has EventType $.event-type;
    has Red::Model:U $.model;
    has Red::Model $.obj;
    has %.change;
}
or maybe it should be a model, to make it possible for the user to store it in a database...
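A hedged sketch of what that could look like, reusing the column syntax from the examples above (RedEvent and its columns are made-up names, not an existing Red model):
model RedEvent {
    has UInt $.id         is serial;
    has Str  $.event-type is column;
    has Str  $.model-name is column;
    has Str  $.change     is column;   # e.g. the %.change hash serialized to JSON
}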
It's something I would expect to be quite appreciated in a big production setup. I would add that perhaps it also makes sense to have a supply of events for all active drivers, allowing for unified processing of everything. Whoever needs just one stream subscribes to an individual driver. Whoever needs everything could use $*RED-EVENTS or the like.
With regard to the Event class, I don't like the idea of limiting the event types in any way. Perhaps it'd make more sense to pass the AST node bound to the event? Or, let's try to generalize it this way: unless I'm mistaken, any event has an object associated with it. No matter what kind of object it is, as the system could emit an AST for a pre-execute event, and could emit a model object for post-execute, or a Failure if execute fails. In this case it'd make more sense to replace the first three attributes with just $.object, and by introspecting the object we could determine what kind of event was received.
A downside of this approach is a possible performance cost, though it shouldn't be very influential in a parallel model, as event emission and processing could take place in their own, likely shared, thread. But a way to set what kind of events a consumer is interested in would undoubtedly be reasonable to have.
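For example, a consumer-side filter could be as small as this (a sketch assuming the unified $*RED-EVENTS supply and the $.object attribute proposed above):
# Sketch: only react to events whose associated object is an AST node
$*RED-EVENTS.grep({ .object ~~ Red::AST }).tap: -> $event { dd $event }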
No idea what %.change is responsible for.
PS. It feels like I'm loading you with extra work. ;)
I really like the idea of passing the AST.
I’m thinking of the unified supply being called Red.events.
Maybe red-emit should emit on the driver’s supply, and .red-tap: $tap-name, ... should tap Red.events.grep({ .db === $driver && .name eq $tap-name }).
But I think we need to have a standard event class on it to make it easier to .grep what you want. And red-emit would be the way of automatically creating that class... so now I’m thinking of:
class Event {
    has Red::Driver $.db;
    has Str $.db-name;
    has Str $.driver-name;
    has Str $.name;
    has $.data;
}
And on automatic events .data would contain the AST, and on red-emit ones it would contain anything that was passed.
About %.change: it would contain the previous and the new values when it was an update... and that’s something I couldn’t find a way to add in this new approach (the update AST has only the new values).
Or maybe:
class Event {
    has Red::Driver $.db;
    has Str $.db-name;
    has Str $.driver-name;
    has Str $.name;
    has $.data;
    has Red::AST $.ast;
    has Red::Model $.object;
    has X::Red $.error;
}
And red-tap "bla", {...} should tap:
Red.events.grep(*.name eq "bla").map(*.data)
And that would not need to be restricted to a single red-do.
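A minimal sketch of red-tap under that definition (assuming the Red.events supply described above; not Red's final API):
# Sketch: a named tap is just a filtered view of the global events supply
sub red-tap(Str:D $name, &block --> Tap) {
    Red.events.grep(*.name eq $name).map(*.data).tap: &block
}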
I might not have been clear enough, but I really didn't mean to get rid of the Event (eh, Red::Event, I guess?) class. I was only talking about removing redundant attrs. From this point of view the last class variant is overloaded. To my view $.data should be sufficient. Upon receiving an event one would just:
given $event.data {
    when Red::AST { ... }
    when Failure  { ... } # or X::Red
    ...
}
The original model object which was the source for an UPDATE could be passed as an additional attribute. Something like $.origin. The final event could then be:
class Event {
    has Red::Driver $.db;
    has Str $.db-name;
    has Str $.driver-name;
    has Str $.name;
    has $.data;
    has Model $.origin;
}
Using the system-wide event supplier for red-emit and red-tap actually makes it unnecessary to enclose named data streams within one red-do. And perhaps it's even better this way, because it would then be possible to start a separate thread with its own red-do and attach to a data stream fulfilled by the mainline, for example.
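For instance (a sketch reusing the placeholder routines from the earlier red-do-async example):
# Sketch: a separate thread runs its own red-do and taps a stream
# that the mainline red-do fulfills
my $worker = start red-do 'db2' => { red-tap "mytest", { store-duplicate $_ } };
red-do 'db1' => { red-emit "mytest", $_ for fetch-data };
await $worker;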
The original point of limiting data streams to their enclosing red-do was only to prevent memory hogging by multiplying Supply/Supplier objects. In your approach that's not needed because a single Supplier provides the service for all subscribers.
One thing which worries me a bit is the use of a shared $.name. I.e. if Red is using events for something and binds to a name, then this name must be a reserved one and not available to a user. To minimize possible collisions I'd propose reserving the whole Red:: namespace. I.e.:
red-emit "Red::MyStream", $data;
Started:
fernando@MBP-de-Fernando Red % perl6 -Ilib -MRed -e '
red.events.tap: -> $ev { say "red => ", $ev }
my $*RED-DB = database "SQLite";
model Bla { has UInt $.id is id }
Bla.^create-table: :if-not-exists
'
red => Red::Event.new(db => Red::Driver::SQLite.new(database => ":memory:", events => Supply.new), db-name => "Red::Driver::SQLite", driver-name => Str, name => Str, data => Red::AST::CreateTable.new(name => "bla", temp => Bool, columns => Array[Red::Column].new(Red::Column.new(attr => bla.id, attr-name => "id", id => Bool::True, auto-increment => Bool::False, references => Callable, nullable => Bool::False, name => "id", name-alias => "id", type => Str, inflate => { ... }, deflate => { ... }, computation => Any, model-name => Str, column-name => Str, require => Str)), constraints => Array[Red::AST::Constraint].new(), comment => Red::AST::TableComment), model => Bla, origin => Red::Model, error => Exception)
fernando@MBP-de-Fernando Red % perl6 -Ilib -MRed -e '
red.events.tap: -> $ev { say "red => ", $ev }
my $*RED-DB = database "SQLite";
model Bla is table<---invalid---> { has UInt $.id is id }
Bla.^create-table: :if-not-exists
'
red => Red::Event.new(db => Red::Driver::SQLite.new(database => ":memory:", events => Supply.new), db-name => "Red::Driver::SQLite", driver-name => Str, name => Str, data => Any, model => Bla, origin => Red::Model, error => X::Red::InvalidTableName.new(table => "---invalid---", driver => "Red::Driver::SQLite"))
'---invalid---' is an invalid table name for driver Red::Driver::SQLite
in method create-table at /Users/fernando/Red/lib/MetamodelX/Red/Model.pm6 (MetamodelX::Red::Model) line 258
in method create-table at /Users/fernando/Red/lib/MetamodelX/Red/Model.pm6 (MetamodelX::Red::Model) line 253
in block <unit> at -e line 6
I've added the error field because it would be good to have the AST and the error in case of error.
I've removed red, and now Red has the events method.
We'll need a place to add the bind information of the AST...
As much as I thought about it, Event requires another attribute for this.
Maybe a Red::Event::AST and a Red::Event::Generic? That way we could use the type to filter...
class Red::Event {
    has Red::Driver $.db;
    has Str $.db-name;
    has Str $.driver-name;
}

class Red::Event::AST is Red::Event {
    has Red::AST $.ast;
    has @.bind;
    has Red::Model $.orig;
}

class Red::Event::Generic is Red::Event {
    has $.data;
}
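Filtering by type would then be a one-liner (sketch):
# Sketch: only AST events reach the tap
Red.events.grep(Red::Event::AST).tap: -> $event { dd $event.ast, $event.bind }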
Does it really make sense to multiply entities here? @.bind would just remain uninitialized for non-AST events. My point is that with your proposal it would be necessary to do a two-level check: first, for the AST/Generic type; then, for Generic, test the data type.
Or, if you don't want to pollute the pure Event with extra attributes, perhaps it would be more elegant to have a Red::Event::Bindings role and mix it into a new Event object for AST. I.e. like:
my $event = Red::Event.new(data => $ast, ...) but Red::Event::Bindings[@binds];
Though it's a bit of a complication from the inside, for a user it'd be clear that if $ev.data ~~ Red::AST then $ev.bind is there for inspection, because $ev ~~ Red::Event::Bindings too.
Besides, this approach is easy to extend for other data types if necessary.
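The role itself could be as small as this (a sketch; the parametric form is an assumption needed to make the Red::Event::Bindings[@binds] mixin above work):
# Sketch: a parametric role carrying the bind values for AST events
role Red::Event::Bindings[@binds] {
    method bind { @binds }
}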
use Red;
Red.events.tap: -> $ev { say $ev.data }
my $*RED-DB = database "SQLite";
model Bla {
    has UInt $.id is id;
    has Str $.name is column is rw
}
Bla.^create-table: :if-not-exists;
my $bla = Bla.^create: :name<test>;
$bla.name = "bla";
$bla.^save;
Bla.^all.grep({ .id < 10 and .name.starts-with: "bla" }).Seq;
sleep 5
Red::AST::CreateTable:
bla.id
bla.name
Red::AST::Insert:
name
id
Red::AST::Select:
(Bla)
Red::AST::Eq:
bla.id
1
Red::AST::Update:
name
Red::AST::Select:
(Bla)
Red::AST::AND:
Red::AST::Lt:
(bla.id)::num
10
Red::AST::Like:
bla.name
bla%
BTW, how hard is it to get actual SQL from AST for an end-user?
Just a question of calling the driver's translate method, passing the AST.
And I was wrong... We do not store the bindings... they're just produced as part of the AST translation.
% perl6 -Ilib -e 'use Red <red-do>;
Red.events.tap: { dd .db.translate: .data }
red-defaults :default(database "SQLite");
red-do {
    model Bla { has UInt $.id is id; has Str $.name is column is rw }
    Bla.^create-table: :if-not-exists;
    my $bla = Bla.^create: :name<test>;
    $bla.name = "bla";
    $bla.^save;
}
sleep 5'
("CREATE TABLE bla(\n id integer NOT NULL primary key ,\n name varchar(255) NOT NULL \n)" => [],)
"INSERT INTO bla(\n name\n)\nVALUES(\n ?\n)" => ["test"]
"SELECT\n bla.id , bla.name \nFROM\n bla\nWHERE\n bla.id = 1\nLIMIT 1" => []
"UPDATE bla SET\n name = 'bla'\nWHERE bla.id = 1\n" => []
Just perfect for human-readable logging and possibly additional debugging!
Perhaps. But debugging could require more than just dumping database events. A debug stream might include intermediate messages about state changes, dumps of local variables, or whatever else could be considered useful. Also, file names and line numbers for the purpose of distinguishing identical messages.
I can't say much in this area because I don't really know what exactly debugging does in Red. But my guess would be that an additional debug event would be useful. The event could be emitted by a debug method. Something like:
# Parametric role: carries the caller's location and debug level so it can
# be mixed in via Red::Event::Debug[...] below.
role Red::Event::Debug[Str $file, Int $line, $level] {
    has Str $.file = $file;
    has Int $.line = $line;
    has $.level    = $level; # debug level, could be of an enum type
}

method debug($level, Str:D $msg) {
    # $caller-file / $caller-line stand in for the caller's location
    Red.emit: Red::Event.new(:data($msg)) but Red::Event::Debug[$caller-file, $caller-line, $level]
}
The good thing about incorporating debugging into the event subsystem is that it allows a user to plug in any reporting mechanism they like. For example, events could be redirected to a WebSocket as JSON and be used by a front-end developer to observe what's going on in the backend.
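A sketch of that kind of bridge (the $websocket connection and the field names are assumptions; to-json comes from the JSON::Fast module):
use JSON::Fast;

# Sketch: forward every Red event to some front-end channel as a JSON message
Red.events.tap: -> $event {
    $websocket.send: to-json %(
        name => $event.name // '',
        data => $event.data.gist,
    );
}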
PS. It's funny that I also thought about events partially duplicating what debug does. But I didn't want to overload you with something beyond the current task. It was too much on the surface to get around, though... ;)
Now every db meta-method can receive :with<db>, passing the database to be used, and ResultSeq has a .with method to set the database that should be used.
https://colabti.org/irclogger/irclogger_log/perl6?date=2019-09-17#l157
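A sketch of how that usage might look based on the description above (exact signatures may differ):
# Sketch: pick the database per call or per ResultSeq
Bla.^load: 1, :with($sqlite);            # meta-method taking :with<db>
Bla.^all.with($pg).grep({ .id < 10 });   # ResultSeq.with sets the db to use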