Closed suhailshergill closed 8 years ago
BTW, the release process is documented here in case we want to include it somehow: https://github.com/ypg-data/sbt-mediative#releasing
that's a good idea @jonas. @dmitri-carpov are you intending to assign this issue to yourself? let us know if you'd like us to assist in any manner
@suhailshergill I vote to close this as the readme now seems pretty fleshed-out.
@yawaramin please check mark all the things in the test section which are covered. if all of them are (i didn't think they all were), then you are welcome to close this issue
@suhailshergill @yawaramin @jonas could you please QA this? The second point was discussed and we decided to leave it out of the official documentation.
@dmitri-carpov If you want to have stuff reviewed or QAed I suggest you create PRs. By adding the text "Fixes #2" the ticket would then automatically be closed when the PR is merged.
@jonas agree, my bad, I should have done it from the very beginning. This issue affects multiple commits now, all related to the README file. If I do a PR now it is going to cover just a part of it. What I'd like is a review for the whole documentation. Will do PRs in the future.
@dmitri-carpov Some notes:
Eigenflow is an orchestration platform which allows to build resilient and scalable data pipelines.
Eigenflow is an orchestration platform for building resilient and scalable data pipelines.
It is created for periodic long running ETL processes where restarting from the beginning in case of failures is critical.
Restarting is an optional and not critical functionality?
Eigenflow encourages process developers to split processes in stages which can be persisted and monitored automatically.
Pipelines can be split into multiple process stages which are persisted, resumed and monitored automatically.
Platform limitations:
Should be moved under Main Features
resolvers += Resolver.url("YPG-Data SBT Plugins", url("https://dl.bintray.com/ypg-data/sbt-plugins"))(Resolver.ivyStylePatterns)

addSbtPlugin("com.mediative.sbt" % "sbt-mediative-core" % "0.1.1")

addSbtPlugin("com.mediative.sbt" % "sbt-mediative-oss" % "0.1.1")
This is not needed
akka
Should be spelled Akka
Eigenflow, eigenflow
Pick one
I think it could also use a good read-through to fix some grammatical errors
Other than that :+1:
(ETL) processes, (ETL) jobs ... Eigenflow encourages process developers to split processes in stages which can be persisted and monitored automatically. ... Hardly pays off for simple atomic jobs (one stage process).
let's limit confusion and refer to these in a standard way. this is a minor nitpick.
Does not provide connectors to 3rd party systems.
i would not call this a "platform limitation". it's out of scope. we may release helper libraries for various backends/connections
It is not a replacement for ESB or BPM systems, in cases when a very complex workflow involved and there is a need for UI to draw the processes it's better to consider another products.
i don't understand this, please elaborate and explain why.
Supports scala language only.
not that i particularly care, but couldn't other jvm languages wrap around our libraries? if so, should we clarify that
String => A
shouldn't this be String => Option[A]? it may not yet be implemented as such today, but in that case please open up an issue. let me know if the motivation etc for the issue aren't clear and i can add details.
once created, please link the issue from the README
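To illustrate the String => Option[A] suggestion, here is a minimal sketch. The names `Stage`, `Extract`, `Load`, `parseUnsafe` and `parseSafe` are hypothetical and are not taken from the eigenflow API. The point is only that a function returning `Option` turns an unrecognized string into an explicit `None` instead of a runtime exception.

```scala
// Hypothetical stage type for illustration; not part of the eigenflow code base.
sealed trait Stage
case object Extract extends Stage
case object Load extends Stage

object StageParser {
  // String => A: partial, throws scala.MatchError on unknown input.
  val parseUnsafe: String => Stage = {
    case "extract" => Extract
    case "load"    => Load
  }

  // String => Option[A]: total, unknown input becomes None.
  val parseSafe: String => Option[Stage] = {
    case "extract" => Some(Extract)
    case "load"    => Some(Load)
    case _         => None
  }
}
```

With the second signature the caller is forced to decide what happens when a persisted stage name no longer matches any known stage.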
you currently have code examples in the README. how are you ensuring that they compile? should we have an examples module with code which compiles and link to it? should we use tut? does this go with "TODO: examples"?
we may want to be more explicit about the OSes we currently support and what the distinction is. specifically a portion would work wherever you can get scala to work, but then we also have some devops scripts specific to Mac OS (btw which Mac OS version do they support?). what's the distinction?
do we have travis builds? if so, could you add a badge for it to the README?
@suhailshergill
It is not a replacement for ESB or BPM systems, in cases when a very complex workflow involved and there is a need for UI to draw the processes it's better to consider another products.
i don't understand this, please elaborate and explain why.
BPM assumes human tasks during the process (usually interactive forms); we obviously do not provide anything like that. All our steps (stages) are "system tasks", which makes it closer to ESB systems. In the case of ESB, if most of the components are written using SOA, then the ESB software usually provides high-level tools for integration and orchestration, whereas in our case all services would have to be programmatically called and integrated. I see Eigenflow somewhere between simple cron jobs and complex enterprise processes (where ESB is usually used).
Supports scala language only.
not that i particularly care, but couldn't other jvm languages wrap around our libraries? if so, should we clarify that
In theory yes, but our DSL is written in Scala and I think it won't be that elegant from Java at all. By "support" I mean API and clear documentation.
@dmitri-carpov i realize i was parsing the sentence incorrectly which led to some of the confusion. makes more sense now. personally i see the fact that everything is programmatically called and integrated as a strength, but i'm biased (coz i can code).
i do wonder how complex a workflow we would be able to handle and to what extent being able to visualize workflows would suffice.
@suhailshergill I'm not an ESB or BPM expert, but from what I saw very few of them support extremely complex workflows, for example conditional branches with "AND joins": when some steps can be executed in parallel but at some point the process waits for all of them to be done (and what to do if one never ends).
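The "AND join" described above can be sketched with plain Scala futures. This is only an illustration using the standard library, not how eigenflow implements or plans to implement it; `stepA`, `stepB`, `joined` and `run` are hypothetical names.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.FiniteDuration

object AndJoin {
  // Two independent steps that may run in parallel (placeholder work).
  def stepA(): Future[Int] = Future { 1 }
  def stepB(): Future[Int] = Future { 2 }

  // The join completes only when both branches have completed.
  def joined(): Future[Int] = {
    val a = stepA() // start both before combining so they run concurrently
    val b = stepB()
    a.zip(b).map { case (x, y) => x + y }
  }

  // Bounding the wait is one answer to "what if one never ends":
  // Await.result throws a TimeoutException once the limit is exceeded.
  def run(limit: FiniteDuration): Int = Await.result(joined(), limit)
}
```

A timeout only detects the stuck branch; deciding whether to retry, skip or fail the whole process is a separate design choice.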
Our current plan is to handle conditional branching, where a workflow may have different paths and the path is chosen by some logic. The skipRun functionality should also appear soon.
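As a rough picture of conditional branching, the sketch below picks the next step from the current step plus some data. All names (`Step`, `Download`, `FullImport`, `IncrementalImport`, `Done`, `Branching.next`) and the row-count threshold are assumptions made for illustration; the real eigenflow DSL may look quite different.

```scala
// Hypothetical steps; not taken from the eigenflow code base.
sealed trait Step
case object Download extends Step
case object FullImport extends Step
case object IncrementalImport extends Step
case object Done extends Step

object Branching {
  // The next step is chosen by logic over the data produced so far.
  def next(current: Step, rowCount: Long): Step = current match {
    case Download if rowCount > 1000L   => FullImport
    case Download                       => IncrementalImport
    case FullImport | IncrementalImport => Done
    case Done                           => Done
  }
}
```

Because `Step` is sealed, the compiler can warn when a newly added step is not handled in `next`.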
Regarding visualization, I think it makes sense when you have an infrastructure full of different web services that you can re-use to build different processes by integrating them. We still target lower-level integration, so I think the benefits would hardly outweigh the effort.
Does not provide connectors to 3rd party systems.
i would not call this a "platform limitation". it's out of scope. we may release helper libraries for various backends/connections
Yes, "platform limitation" is probably not the best place for it. I just wanted to make it clear because most other ETL platforms provide built-in connectors, but I agree connectors would probably be better provided as a separate dependency.
Motivation
To better communicate and establish what eigenflow is intended and good for we need to add more documentation which provides some example usecases and highlights some comparables.
Input
Current README.md
Output
Updated README.md
Test
eigenflow is designed to handle
[ ] Comparison and contrast between eigenflow and at least two other similar things
eigenflow is not intended for