facebookarchive / bistro

Bistro is a flexible distributed scheduler, a high-performance framework supporting multiple paradigms while retaining ease of configuration, management, and monitoring.
https://bistro.io
MIT License

How to use bistro as workqueue? #17

Closed yinlinzh closed 7 years ago

yinlinzh commented 7 years ago

Hi Developers,

I'm setting up a distributed task scheduling system to migrate data from HDFS to a local file system. The example use case is to query the namenode for a file list, then create and schedule a task for each individual file. The example in the README shows how to schedule and execute a single task written as a script. Here are my questions:

  1. Is there an API for integrating with Bistro from other languages?
  2. How do I use Bistro as a work queue? In other words, after creating each task, how do I persist the created tasks? I noticed that Bistro can talk to MySQL, HBase, and Postgres. Could you please provide more information about this?
  3. My plan is to run a "task agent" on each "worker" instead of using shell scripts, that is, a daemon process that runs on the worker and waits for the scheduler to assign it work. Could you please provide some examples?

Thanks in advance!

snarkmaster commented 7 years ago

1) What sort of integration are you looking for?

2) A few people have used Bistro as a work-queue in the past.

Are you trying to queue many sharded jobs, or to queue per-shard tasks for a single job? Can you give more detail?

The typical pattern for a queue of many jobs would be:

The typical pattern for a queue of many shards / tasks would be similar:

3) I don't really understand your question. Can you describe in more detail what you are trying to do and how you want to do it? I assume you are aware of the binaries bistro_scheduler and bistro_worker, and how they interact? If not, you should try --help with both, and skim the protocol documentation:

snarkmaster commented 7 years ago

@yinlinzh, please reopen if you would like to resume this discussion :)