Closed Strate closed 8 years ago
It isn't supposed to wait... the pipeline is reactive and sends updates as they happen. Maybe you need the debounce
plugin if you want to collect events. Isn't it better if your events get written as they happen instead of the build system forcing things to wait?
It isn't supposed to wait
Okay, how to do that?
I need to wait for completion of all build stages and then pass all events (which are compiled files) to webpack (I am working on plugin now). Webpack need all compiled assets to bundle them.
And, if it not supposed to wait, how mocha
pipeline from example works?
pipelines['tests-run'] = [
pipeline('build-source', 'build-tests'),
debounce(500),
mocha({ files: 'lib/**/*.spec.js' })
]
So, it does not wait for build-source
and build-tests
stages completion. What happens if tests already built, and sources no, mocha runs on half-built system?
Check out the documentation for debounce
As for waiting for completion, then consider how the -w
mode might work? Sigh is reactive, I do not think webpack's API is.
Documentation of debounce says:
Combines events in the pipeline until the event stream settles for longer than the given period.
Imaginate situation, connected to test-run
pipeline. What if there is a huge file, which compilation took more than 1 second. It means that there would be no events whithin some time, and debounce
consider it delay as signal to invoke mocha. But this is wrong, isn't it? mocha should run after, and only after whole compilation completes, not after delay since last event.
If it is not possible, seems that sigh could not be used for complex projects :-(
And, this is VERY unclear from documentation, really. I think that you could describe it explicitly.
It is a functional reactive build system, it can be used in very complex scenarios and supports incremental rebuilds at a level impossible in almost any other system. But you have to change your way of thinking. You aren't considering the reactive context at all if you think in terms of streams "ending". When you use -w
streams never end. I think it's wise to avoid making incendiary remarks about a domain you are not yet familiar with.
Okay, let forget about remark about complex projects, keep discuss cool.
How can I ensure, that all my files are compiled and stream contains all files compiled? Something like that?
[
glob('src/**/*.*'), // convert all files to events
typescript(), // compile all ts files
stylus(), // compile all stylus files
babel6(), // compile all js and jsx files
webpack() // Here I expect a stream, fullfilled with events which are compiled files. Webpack plugin converts 100 assets to 1 js file
write("target") // asset file, produced with webpack comes to target folder
]
Am I miss something?
It requires quite a lot of words to explain, I'll come back to this in a few days when I have more time.
NP, I'll continue play around with it, thank you for your participation!
Definitely check out some articles about Functional Reactive Programming
first, sigh is designed along these principles. Also articles about Incremental Rebuilds
.
This one is pretty good: https://gist.github.com/staltz/868e7e9bc2a7b8c1f754
On initial state glob
populate stream with one event, which is array of file events.
And when this event passing through consumers, every consumer accepts this one event with all files.
This works perfectly with one flat pipeline:
pipelines.buildts = [
glob('src/**/*.ts'), // emits one stream event, which is array of "ts" file events
typescript() // accepts one stream event, which is array of "ts" file events, emits one stream event, which is array of compiled "js" files
write('target') // accepts one stresm event, which is array of file events, emitted from previous consumer "typescript"
]
But on distributed pipelines it is not so clear:
pipelines.buildts = [
glob('src/**/*.ts'), // emits one stream event, which is array of "ts" file events
typescript() // accepts one stream event, which is array of "ts" file events, emits one stream event, which is array of compiled "js" files
]
pipelines.buildcss = [
glob('src/**/*.ts'), // emits one stream event, which is array of "css" file events
postcss(), // accepts one stream event, which is array of "css" file events, emits one stream event, which is array of compiled "css" files
]
pipelines buildall = [
pipeline("buildts", "buildcss") // this will emit *two* stream events, each one event emitted from previous pipelines
write("target") // accepts *two* stream events, each one is compiled "ts" or "css" files
]
And while for write
consumer there is no problem to handle two events, beacuse it can handle them separately, other consumers (concat
, mocha
, webpack
, browserify
and others) needs to get all files from previous pipelines together, not separately.
One of solution is debounce
consumer, it performs concatenation of stream events to one stream event (performing array.concat
operation). But debounce
emit event after some delay between events. And this could be a problem on initial phase: if there is a big delay between events (for example, one of previous pipeline is complex and completes much later than previous), debounce will emit event without files from that pipelines. And this gonna to be a problem for consumers listed above.
I think, that pipeline(1, 2, ..., n)
(or merge(1, 2, ..., n)
) consumer should waits for n
events on initial stage and concatenate them.
Hope that I do not missed something.
That change won't be made.
First try to understand how -w
mode works then we can talk about this
-w
mode emits one stream event per each file change, which is array of one file event, but does it matters? there is no problems with watch mode, at least because any event from -w
mode comes after initial phase of build. And any consumer can store files from initial phase to memory (i.e. to memory-fs), update files in that structure and performs compilation from memory on each event from watch mode.
Cool, that's almost correct. I guess that's why, instead of modifying pipeline/merge, I think a new plugin that combines events and forwards them all together with storing them in something like memory fs would be better. Your suggestion would break the -w case.
Although given it works off of streams it should instead just group events (taking remove events into account) and then add another plugin to deal with APIs that need an fs. APIs that are fs-only are annoying ;)
Your suggestion would break the -w case.
But why? I suggest to do wait/concatenation only for initial phase. It could be tested pretty simple. If there is watch mode just forward any event without awaiting anything.
And there is not about storing memory at all. It about to ensure, that all initial phase build completed.
Using the promise like that to concatenate the first set of events is nasty, should be using Bacon for that. The plugin should also coalesce events to support watch mode updates. Their is code to support this in sigh-core.
There is a big fat TODO on coalesceEvents
method https://github.com/sighjs/sigh-core/blob/master/src/stream.js#L98
toFilesystemState is the one. It keeps a catalogue of all events as they relate to the filesystem and passes them all along after receing each update.
coalesceEvents would be for filtering out unnecessary duplicates which you don't need in your context.
Okay, it could be easy added as the next consumer for stream. What about your remark about Promises. Could you give me some points to Bacon method which should be used? I am really not so close with it.
I have play around with it, together with a watch mode, seems that is works good.
And, from my point of view, this behaviour expected to be default behaviour of merge
/pipeline
builtins.
Adding toFileSystemState
was a bad idea, because this turns one file update event to N file events, which are always bypasses to filesystem.
Am confused, you want to pass "all events" down the stream in the first payload, but then subsequent payloads only contain the updates? I can't quite figure what would benefit from this hybrid/inconsistent approach to the pipeline.
I can't quite figure what would benefit from this hybrid/inconsistent approach to the pipeline.
Sad to read this, I have tried to describe it several times with different examples in this thread. I'm gonna to try it again.
Imagine that you have a quite big project. It contains a lot of files, you want to transpile them in parallel and then, only after all files transpiled, run webpack
/browserify
on them. Definetely that tools should run on all transpiled files, not on only the part, hope that it is really clear.
Lets try to implement the pipeline:
pipelines.build = [
merge(
[glob("src/**/*.ts"), typescript(), babel6()],
[glob("src/**/*.styl"), stylus(), postcss()]
),
write('target"),
webpack("target/src/index.js")
]
At first glance, this should works like this:
In a fact (and this is really not clear from docs/examples), it works like this:
This is definetely not the thing that I want. And this is definetely wrong: at first time webpack runs on half-populated target directory (without stylus files) and it will definetely fail.
You can tell me that debounce
is the answer, and I will tell you that not. Because debounce aggregate events during some delay in milliseconds, and this is easy gonna to race condition:
awaitAndMerge
plugin solves that problem in a right way: it does not operate with delay, it operates with count of child pipelines, waits for them and merge their results together, and webpack will run on fully populated target directory, without any race conditions or extra delays:
pipelines.build = [
awaitAndMerge(
[glob("src/**/*.ts"), typescript(), babel6()],
[glob("src/**/*.styl"), stylus(), postcss()]
),
write('target"),
webpack("target/src/index.js")
]
Of course on watch mode it should not buffer and wait for anything, because watch event can come only from one child pipeline. But this could not produce any error, because any watch events comes after, and only after initial build. And initial build always populates target directory with full bunch of files, so updates can come one-by-one.
And this behaviour was expected by me by default for builtin merge
/pipeline
I get that, instead of using fs based APIs it's better to use the API exported at the library layer. Depending on the target API you can either do this incrementally or not. Your plugin only works for fs-based APIs (maybe webpack doesn't export a programmatic API?). So to handle such APIs you're better of not using write at all... That's the plugin you want to replace, ideally with one that uses an in-memory fs.
So instead of awaitAndMerge(...), write(targetDir), webpack()
ideally you'd want something like collectInMemoryFs(), webpack()
.
collectInMemory fs would story the files/directories in say, /dev/shm
, and change the paths of the events to reference these memory files. Then the webpack()
API can work far more efficently.
However of course it's better if you don't have to use an fs based API at all. Then you could use something like your previous example... and then you'd definitely want to emit the total filesystem state because it would make dealing with the target API much easier (unless the target API is capable of incrementally building, then you wouldn't need the await
stage at all).
You seem to be getting pretty emotional about this, I think it's better to try to adjust your mindset to a new way of solving problems (in an FRP kinda way) than getting annoyed about why you can't push FRP into your idea of how things have traditionally worked.
How collectInMemoryFs
prevents running webpack
(or browserify
, or mocha
) twice (or more) on initial build phase?
collectInMemoryFs({ coalesceInitialEvents: 3 })
, or via two plugins: collesceInitialEvents(3), collectInMemoryFs()
.
awaitAndMerge
do right this thing: coalescing initial events, but instead of accepting count of that events, it accepts child pipelines and waits for pipelines.length
events. This behaviour is expected to be default for builtin merge
.
What about your notes for not using fs-based api. This is not about fs-base or not fs-based. This is about to run task after whole initial build finished. This could be really any kind of task, event not connected with fs at all.
btw, I have made a little refactoring for await-and-merge
https://github.com/Strate/sigh-await-and-merge/blob/master/src/index.js, now it correctly await for all initial phase completed, collected initial phase events are merged with probably raised non-initial events and all of them transformed via toFileSystemState
This behaviour is expected to be default for builtin merge.
As you've repeated many many times. However it really isn't a good idea, it would break many awesome incrementally rebuilding pipelines that have been built with sigh. You say you want it only because you're considering working with non-incrementally-rebuilding APIs like webpack. To keep on repeating you think the way it is designed is a mistake and it should be your way only shows you haven't considered many many possibilities.
it would break many awesome incrementally rebuilding pipelines that have been built with sigh
How exactly? This is only about initial build, rebuilding is not touched at all, isn't it?
As I've said many times, there is no concept of "initial build" in many pipelines. It's FRP man... FRP!!!
I will implement "await and merge" mode as an option to merge
. Check out https://github.com/sighjs/sigh/issues/46. You can check out the code for how to implement the algorithm using FRP and without using the promise like that.
when I run
sigh
, target folder populated bycompile:dev
pipeline after each requested pipeline finished, not after all finished:I have tried to rewrite
compile:dev
pipeline like that:and there is no difference. How to force
pipeline
plugin to wait all nested pipelines to complete? Or maybe I do something wrong?