orientechnologies / orientdb

OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID Transactions, Full-Text indexing and Reactive Queries.
https://orientdb.dev
Apache License 2.0

Cron like Scheduler class #1226

Closed mattaylor closed 11 years ago

mattaylor commented 11 years ago

Proposal for a system scheduler class (OSchedule or OCronjob) which contains links to OFunctions together with a Linked Class, an argument array, and Cron job style rules.

These functions would be called periodically according to the cron rules, and the output would be logged in a logging class (OLog or OChronLog) as timestamped records linked back to the calling schedule job.
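As a rough illustration of the record layout proposed here, the sketch below models a schedule entry and its log entries as plain Java objects. Everything in it is an assumption made for illustration: `ScheduleSketch`, `LogEntrySketch`, and `runOnce` are hypothetical names, a `java.util.function.Function` stands in for a linked OFunction, and the cron rule is only stored as text, not evaluated. It is not OrientDB's API.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a log record (the OLog idea): the output,
// a timestamp, and a link back to the job that produced it.
final class LogEntrySketch {
    final ScheduleSketch job;
    final Instant timestamp;
    final String output;

    LogEntrySketch(ScheduleSketch job, Instant timestamp, String output) {
        this.job = job;
        this.timestamp = timestamp;
        this.output = output;
    }
}

// Hypothetical stand-in for a schedule record (the OSchedule idea).
final class ScheduleSketch {
    final String name;
    final String cronRule;                      // kept as plain text here, not parsed
    final Function<Object[], Object> function;  // stands in for a linked OFunction
    final Object[] arguments;

    ScheduleSketch(String name, String cronRule,
                   Function<Object[], Object> function, Object[] arguments) {
        this.name = name;
        this.cronRule = cronRule;
        this.function = function;
        this.arguments = arguments;
    }

    // Calls the function once and appends a timestamped entry to the log.
    LogEntrySketch runOnce(List<LogEntrySketch> log) {
        String output;
        try {
            output = String.valueOf(function.apply(arguments));
        } catch (RuntimeException e) {
            output = "ERROR: " + e.getMessage();
        }
        LogEntrySketch entry = new LogEntrySketch(this, Instant.now(), output);
        log.add(entry);
        return entry;
    }

    public static void main(String[] args) {
        List<LogEntrySketch> log = new ArrayList<>();
        ScheduleSketch job = new ScheduleSketch(
            "sum", "0 * * * *",
            a -> ((Integer) a[0]) + ((Integer) a[1]),
            new Object[] {2, 3});
        LogEntrySketch entry = job.runOnce(log);
        System.out.println(entry.timestamp + " " + entry.job.name + " " + entry.output);
    }
}
```

In a real implementation the log entries would presumably be database records rather than in-memory objects, so that they can be queried remotely.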

These functions could be used to perform many very useful things like

pellyadolfo commented 11 years ago

What would be the advantages of integrating, for example, http://quartz-scheduler.org/ into the OrientDB core versus running it as independent software? Are there any options to make it pluggable at will?

mattaylor commented 11 years ago

Seems like a good way to go, but I think it is important that the functions be defined as OFunctions and registered using a system class like OSchedule, to make it easy to manage everything remotely via the studio.

lvca commented 11 years ago

Hey @mattaylor, this is another good point! @pellyadolfo I usually think 10 times before adding a new library to OrientDB. Since all we would need from Quartz is its cron-like syntax, I think we could implement it on our own or find a lighter library.
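To give a sense of how small a home-grown matcher for cron-like syntax could be, here is a minimal sketch. It assumes the common five-field layout (minute, hour, day of month, month, day of week, with 0 meaning Sunday) and supports only `*`, single numbers, comma lists, and `a-b` ranges. The class name `CronRuleSketch` is hypothetical, and the sketch deliberately leaves out step values, month and day names, and the special day-of-month/day-of-week handling that real cron implementations have.

```java
import java.time.LocalDateTime;

// Simplified five-field cron rule: minute hour day-of-month month day-of-week.
// All five fields must match (a simplification compared with real cron).
final class CronRuleSketch {
    private final String[] fields;

    CronRuleSketch(String rule) {
        fields = rule.trim().split("\\s+");
        if (fields.length != 5) {
            throw new IllegalArgumentException("expected 5 fields: " + rule);
        }
    }

    boolean matches(LocalDateTime t) {
        // java.time uses Monday=1..Sunday=7; map Sunday to 0 as cron does.
        int dayOfWeek = t.getDayOfWeek().getValue() % 7;
        return fieldMatches(fields[0], t.getMinute())
            && fieldMatches(fields[1], t.getHour())
            && fieldMatches(fields[2], t.getDayOfMonth())
            && fieldMatches(fields[3], t.getMonthValue())
            && fieldMatches(fields[4], dayOfWeek);
    }

    // A field matches if any of its comma-separated parts matches the value.
    private static boolean fieldMatches(String field, int value) {
        for (String part : field.split(",")) {
            if (part.equals("*")) {
                return true;
            }
            int dash = part.indexOf('-');
            if (dash > 0) {
                int low = Integer.parseInt(part.substring(0, dash));
                int high = Integer.parseInt(part.substring(dash + 1));
                if (value >= low && value <= high) {
                    return true;
                }
            } else if (Integer.parseInt(part) == value) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        CronRuleSketch rule = new CronRuleSketch("0,30 9-17 * * 1-5");
        System.out.println(rule.matches(LocalDateTime.now()));
    }
}
```

A scheduler built on this would check `matches` once per minute for each registered job. Whether that covers enough of the syntax is exactly the trade-off against pulling in a library.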

pellyadolfo commented 11 years ago

Ok, Luca, I agree with a custom engine, as Quartz is oversized.

lvca commented 11 years ago

I found this library of 211kb: http://www.sauronsoftware.it/projects/cron4j/manual.php#p14

Could it be enough?

mattaylor commented 11 years ago

Sounds about right to me. We probably also need a property in the scheduler record to represent the user to run the function as.

Thanks, Mat

lvca commented 11 years ago

I moved this issue to the upcoming 1.4 release because it's easy to implement and offers a huge gain for OrientDB web apps.

mattaylor commented 11 years ago

We got this one. Henry will be submitting a pull request early next week. As a heads up, we are only using a couple of classes from cron4j, which is LGPL, so some kind of attribution notice will be required.

lvca commented 11 years ago

Hey, cron4j sounds good. Waiting for such pull request ;-)

giastfader commented 11 years ago

Hi, I'm following this thread and I was wondering what will happen in a distributed scenario. In a cluster, the data in the new OSchedule class will be replicated across the nodes, so each node will attempt to execute the scheduled tasks and run the associated functions. Those functions may modify data in other classes, and those changes must in turn be propagated to the other nodes, producing conflicts or something similar. Could this be a real problem? How can it be avoided?

mattaylor commented 11 years ago

Only one thread will be run per job. The state of the job will also be stored on the job record, e.g. run, stop, suspended, etc. There will also be a flag to indicate that the job should be run on startup.
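One way to picture the "one thread per job, state on the record" idea is the sketch below. It is only an assumption about how such a guard could work within a single JVM: the names `JobStateSketch`, `JobRecordSketch`, and `tryRun` are hypothetical, and an in-memory `AtomicReference` stands in for a state field that would really live on the stored job record.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical job states, loosely following "run, stop, suspended".
enum JobStateSketch { STOPPED, RUNNING, SUSPENDED }

final class JobRecordSketch {
    // In-memory stand-in for a state field stored on the job record.
    final AtomicReference<JobStateSketch> state =
        new AtomicReference<>(JobStateSketch.STOPPED);
    final boolean runOnStartup;

    JobRecordSketch(boolean runOnStartup) {
        this.runOnStartup = runOnStartup;
    }

    // Runs the body only if the job is currently STOPPED. The atomic
    // STOPPED -> RUNNING switch lets at most one thread in at a time.
    boolean tryRun(Runnable body) {
        if (!state.compareAndSet(JobStateSketch.STOPPED, JobStateSketch.RUNNING)) {
            return false; // already running, or suspended
        }
        try {
            body.run();
        } finally {
            // Return to STOPPED unless the state was changed while running.
            state.compareAndSet(JobStateSketch.RUNNING, JobStateSketch.STOPPED);
        }
        return true;
    }

    public static void main(String[] args) {
        JobRecordSketch job = new JobRecordSketch(true);
        boolean ran = job.tryRun(() -> System.out.println("job body executed"));
        System.out.println("ran=" + ran + " state=" + job.state.get());
    }
}
```

Note that this guard only protects against concurrent runs inside one server process. It says nothing about several nodes each holding a copy of the same record.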

giastfader commented 11 years ago

Yes, I understand that there will be one thread per job. But which node should start a job? Since the job data is in the OSchedule class, which is present on all nodes participating in a cluster with the same information, all nodes will try to execute the same job at the same time.

Am I missing something?

mattaylor commented 11 years ago

Right now the only way to control the scheduler in a cluster is via the config files for each instance. The scheduler should be enabled in one instance but not in the others. Once we have a little more clarity and stability in the way the DHT index is implemented, we can be a little smarter about this.
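The per-instance switch could look roughly like the sketch below. The property key `scheduler.enabled` and the class `SchedulerSwitchSketch` are made up for illustration and are not OrientDB's actual configuration. The point is only that each node reads its own config and that exactly one node is configured with the scheduler turned on.

```java
import java.util.Properties;

final class SchedulerSwitchSketch {
    // Hypothetical config key; defaults to "off" so a node only runs
    // scheduled jobs when its own config explicitly enables them.
    static final String KEY = "scheduler.enabled";

    static boolean schedulerEnabled(Properties nodeConfig) {
        return Boolean.parseBoolean(nodeConfig.getProperty(KEY, "false"));
    }

    public static void main(String[] args) {
        Properties nodeA = new Properties();
        nodeA.setProperty(KEY, "true");   // the one node that runs jobs
        Properties nodeB = new Properties(); // no entry: scheduler stays off

        System.out.println("node A runs jobs: " + schedulerEnabled(nodeA));
        System.out.println("node B runs jobs: " + schedulerEnabled(nodeB));
    }
}
```

The obvious weakness is that nothing takes over if the enabled node goes down, which is presumably where smarter cluster-wide coordination would come in.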