sighjs / sigh

multi-process expressive build system for the web and node.js, built using baconjs observables

full-stack example #4

Closed tcurdt closed 8 years ago

tcurdt commented 9 years ago

Would be great to have a more complex, real-world example with support for

insidewhy commented 9 years ago

I'm happy to help with any questions you may have. There are a few examples shown in the main README, in various sigh-plugin READMEs, and in the presentation.

jsx is just an option to the babel plugin; there are many examples shown which use this sigh plugin. The options object is passed through directly to the babel compiler API, so you can check their API documentation for these options. sass is shown on the sass plugin page; same story with the options object and their API. less/browser-sync/sitemap.xml may be supported via the gulp plugin compatibility layer; there are currently no sigh plugins for those things.
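
For instance, a minimal Sigh.js sketch (file paths are illustrative; the modules option is one the babel compiler API accepts):

var glob, babel, write // sigh injects plugins into declared variables

module.exports = function(pipelines) {
  pipelines.js = [
    glob('src/**/*.jsx'),
    // this object is forwarded as-is to the babel compiler API
    babel({ modules: 'common' }),
    write('build')
  ]
}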

Let me know if you want me to go into more detail on any of those things.

insidewhy commented 9 years ago

The presentation is best viewed in Chrome and in full screen (hit the F key), and navigated with the arrow keys; otherwise you'll miss many of the examples.

insidewhy commented 9 years ago

Oh, I forgot to mention source maps: they are supported out of the box. APIs are provided for plugin writers to make it trivial to transitively apply and concatenate source maps. As long as the underlying transformer API the plugin wraps is capable of generating a one-step source map, the sigh plugin should support it too. All existing plugins support source maps, and most gulp plugins should out of the box as well.
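
For plugin writers that looks roughly like the sketch below - not sigh's verbatim API: the op.stream shape and the applySourceMap name are assumptions taken from the plugin-writing guide, and compile is a made-up transformer returning { code, map }:

module.exports = function(op) {
  return op.stream.map(function(events) {
    return events.map(function(event) {
      var result = compile(event.data) // made-up one-step transform
      event.data = result.code
      event.applySourceMap(result.map) // sigh chains this onto earlier maps
      return event
    })
  })
}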

insidewhy commented 9 years ago

Here are a few more good examples:

All the features used in those files are explained in the main documentation and in the documentation for the respective plugins used.

tcurdt commented 9 years ago

Thanks, already found the presentation.

The links are nice to give an idea but are nowhere near a full setup.

insidewhy commented 9 years ago

Many of them are links to sigh files being used to build various software on GitHub right now, so they are real-world examples. Perhaps you were thinking more along the lines of a "full-stack" web example? If so, I can dig up some examples of full-stack projects; I'm currently using it in a few projects of this nature alongside the jspm package manager.

tcurdt commented 9 years ago

Yeah - they seemed like simplified examples. But well - if the setup is simple then it's still real-world :)

Indeed - a full-stack example would be great.

insidewhy commented 9 years ago

Maybe they seem simplified because sigh is so expressive ;) If you were to replicate them in gulp/grunt then they would look a lot more substantial. Hehe.

insidewhy commented 9 years ago

@tcurdt Taken together, the examples I can find make a great setup, but each one is missing something (one doesn't have tests, one doesn't use sass/less).

I think the best thing is for me to make a yeoman sigh-fullstack generator, then also document that example in the README. :rocket: kill 2 :baby_chick:

tcurdt commented 9 years ago

Documentation is certainly an option, too. Are there any drawbacks to using gulp plugins?

insidewhy commented 9 years ago

@tcurdt sorry for the delay in responding, I've had the flu. All gulp plugins that are single-stream transformers are supported, and you shouldn't find any drawbacks. Gulp plugins that supply multiple methods which need to be used together are not supported.

insidewhy commented 9 years ago

Added browser-sync plugin: https://github.com/sighjs/sigh-browser-sync

tcurdt commented 9 years ago

Great!

I am still failing to wrap my head around the plugin API though.

I guess for implementing less I could probably go by looking at the sass plugin. Same for markdown support.

Where I am lost is templating (e.g. swig) and passing collections in to the templates (like https://github.com/segmentio/metalsmith-collections), which could also be used for rss/atom feeds and the sitemap.xml.

tcurdt commented 9 years ago

I've been trying the following:

{
  "devDependencies": {
    "sigh": "^0.12.18",
    "sigh-babel": "^0.11.5",
    "gulp-markdown": "^1.0.0"
  },
  "dependencies": {
  }
}

var glob, babel, write, markdown // sigh injects plugins into these declared variables

module.exports = function(pipelines) {

  pipelines.md = [
    glob('posts/**/*.md'),
    markdown(),
    write('build')
  ]

  pipelines.js = [
    glob('src/**/*.js'),
    babel(),
    write('build/assets')
  ]

  pipelines.alias.build = [ 'js', 'md' ]
}

but when running it I get:

$ sigh -w

sigh-test/node_modules/sigh/src/Event.js:83
      this._sourceMap.sourcesContent = [ this.sourceData ]
                                    ^
TypeError: Cannot set property 'sourcesContent' of undefined
    at _default._createClass.get (sigh-test/node_modules/sigh/src/Event.js:83:37)
    at sigh-test/node_modules/sigh/src/gulp-adapter.js:50:32

tcurdt commented 9 years ago

Same with the gulp-marked plugin.

insidewhy commented 9 years ago

Instead of glob('src/**/*.js') you probably want glob({ basePath: 'src' }, '**/*.js'), otherwise the src directory will be reproduced in the output folder. As for the error you're getting there... sorry about that, I just fixed it and released version v0.12.19. It was another issue to do with supporting files for which identity source maps cannot be calculated.

Let me know if there's anything else standing in your way, I'm happy to provide as much support as you need!

insidewhy commented 9 years ago

Oops didn't mean to close it, I'll have a full-stack example ready for you next weekend.

insidewhy commented 9 years ago

I am still failing to wrap my head around the plugin API though.

It's definitely worth reading a tutorial on functional reactive programming if you haven't already, especially one on Bacon.js, the FRP library sigh uses. It takes a different kind of mindset from how you might be used to working with JavaScript; more like writing Haskell. The existing plugins also use Promises a lot, so you should definitely be aware of how promises compose.

The cost of learning FRP is higher, but once you know it, the rewards of neater, shorter, more understandable and maintainable code are worth it.
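
If it helps, here is a minimal plain Bacon.js sketch (not sigh's own API) showing that style of composition:

var Bacon = require('baconjs')

// emit three values 50ms apart, transform each one, then batch
// anything arriving within a 150ms window into a single array
Bacon.sequentially(50, [1, 2, 3])
  .map(function(n) { return n * 2 })
  .bufferWithTime(150)
  .onValue(function(batch) { console.log(batch) })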

tcurdt commented 9 years ago

Awesome. That release got me much further.

I am not new to reactive programming. For example I fell in love with ReactiveCocoa quite a while ago - but I will have a closer look into Bacon.js as well.

The intro at https://github.com/sighjs/sigh/blob/master/docs/writing-plugins.md is a good start, and looking at https://github.com/sighjs/sigh-sass/blob/master/src/index.js it's quite clear what happens - but it seems to work entirely at the single-file level.

So I looked at https://github.com/sighjs/sigh/blob/master/src/plugin/concat.js and got a bit lost. What should be trivial seems like quite some work - and frankly I just don't get https://github.com/sighjs/sigh/blob/master/src/plugin/concat.js#L15 with the opTreeIndex, for example. And why is there a collection of events here: https://github.com/sighjs/sigh/blob/master/src/plugin/concat.js#L21? Is this what is getting buffered with the debounce?

And how could a plugin provide input to another plugin? For example providing a collection (like blog posts) to a template engine.

insidewhy commented 9 years ago

opTreeIndex for example

Actually concat is probably the only plugin that needs to worry about that. opTreeIndex relates to the depth-first index, within the tree of streams, of the plugin that created the source. The concat plugin uses it to ensure the files are concatenated in the same order; this is really good as it gives a reliable order and lets you configure that order by e.g. changing the order in which glob expressions appear in a glob.

why is there a collection of events here

In sigh the stream payload is an array of events rather than a single event. This allows plugins to much more easily support n:m operations (in gulp it's very hard to support anything but 1:1 operations, as you've probably read in broccoli's criticisms of gulp).

Usually only operations relating to changed files are sent down the tree; toFilesystemState ensures the event array contains an event for every single source tree entry, because of course the concat plugin needs to concatenate all files, not just those that have changed.

Is this what is getting buffered with the debounce?

If two events are emitted 100ms apart (each of which is an array of one event, due to what I was just discussing) and the debounce interval is set to 150ms, then you will get an array of two events 150ms after the second event, rather than receiving each event individually.
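
In pipeline form (pattern and interval are illustrative):

var glob, babel, debounce, write

module.exports = function(pipelines) {
  pipelines.js = [
    glob('src/**/*.js'),
    babel(),
    debounce(150), // batch event arrays arriving within 150ms of each other
    write('build')
  ]
}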

And how could a plugin provide input to another plugin?

Plugins forward events down the tree; plugins further down the tree receive events from the plugins before them. e.g.

[ glob('*.js'), babel(), write() ]

glob sends events to babel, babel receives those events and transforms them, then write receives the transformed events. You can also use merge to fork the tree and recombine events, filter to remove events, and pipeline to connect pipelines together.
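
To make that concrete, here is a hypothetical 1:1 transform plugin; the op argument, its stream field and the event fields used are assumptions based on the plugin-writing guide:

// sigh calls the exported function with an op whose stream is the
// Bacon stream of event arrays; the plugin returns a new stream
module.exports = function(op) {
  return op.stream.map(function(events) {
    return events.map(function(event) {
      if (event.type !== 'remove')
        event.data = event.data.toUpperCase() // stand-in transformation
      return event
    })
  })
}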

insidewhy commented 9 years ago

About opTreeIndex

[
  merge(
    glob('*.js'),
    glob('client/*.js')
  ),
  concat()
]

Events from glob('*.js') have an opTreeIndex of 1, events from glob('client/*.js') have an opTreeIndex of 2. Therefore concat knows to order files matching the pattern *.js before files matching the pattern client/*.js when it concatenates them all.

insidewhy commented 9 years ago

BTW everything I've mentioned in my last two messages can also be found in the plugin writing guide.

tcurdt commented 9 years ago

Well, without knowing a bit more about how sighjs works under the hood, the explanation "depth-first index of operator within pipeline tree. This can be written to in order to this to set the treeIndex for the next pipeline operation otherwise it is incremented by one." wasn't all that clear - your two messages were much(!) better at explaining it.

OK - I think I've got the general idea of how it works now :)

tcurdt commented 9 years ago

About passing down information: the general passing down of events is clear - but let's say I wanted to implement something like https://github.com/segmentio/metalsmith-collections. Then I would have a plugin globbing files as input; the plugin would have to cache the collection (and keep it sorted), and on changes it would pass on a special event with the sorted collection of metadata as content.

The events from the collection and from the normal template pipeline somehow have to be merged to make them available to the template plugin.

Did I get that somewhat right?

insidewhy commented 9 years ago

You can pass whatever you want down the stream, as long as the plugins further down the stream can deal with that payload. In most cases you'd want the payload to be an array of Event objects, but it doesn't have to be. You could also attach new fields to the Event objects containing whatever metadata you need. These fields could even all reference the same object, allowing a subsequent plugin to read the metadata from any event.
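
A sketch of that idea, reusing the assumed plugin shape from the earlier example; updatePosts is made up:

module.exports = function(op) {
  var collections = { posts: [] } // plugin-local cache, passed by reference

  return op.stream.map(function(events) {
    events.forEach(function(event) {
      updatePosts(collections.posts, event) // made-up: keep the cache sorted
      event.collections = collections // every event shares the same object
    })
    return events
  })
}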

insidewhy commented 9 years ago

It might even be worth introducing a new CustomEvent type, where each one contains a tag (detailing what kind of event it is) along with a custom data object. That might make them easier for plugins to ignore by default.

tcurdt commented 9 years ago

As long as they are passed by reference, the collections could be set as metadata on every event - but I am not sure that really works well. Let's say I have a blog post and change just the date. The only thing that really changes is the order in the collection. As the dependency is on the collection - what event should be passed to trigger all dependent updates?

insidewhy commented 9 years ago

Well, all this plugin really does is update some global metadata. Sigh could provide the same thing, but yeah... creating global metadata as part of a build system flow doesn't seem like a good idea to me (global state and functional programming are like hot oil and ice cubes). I'd rather see metadata passed down the stream in some other way. I mean, sure, a plugin could still cache some state of its own, but I'd rather see it pass a copy of that state down the stream than make it available globally. What if you have two streams, each with their own collections? Then the global states could interfere with each other.

To come up with a better solution though I'd want to see how the metadata created by the collections plugin is actually used.

Let's say I have a blog post and change just the date. The only thing that really changes is the order in the collection.

Well, when you change that file you'd probably want to pass an event representing that change down the stream anyway, so it doesn't seem so bad to attach the metadata to it. Then again, maybe you don't care about forwarding the input events, in which case you could have the plugin pass on only custom metadata events?

tcurdt commented 9 years ago

Then let's have a look at an example. Here is a snippet from a metalsmith pipeline:

.use(collections({
  posts: {
    pattern: 'posts/**',
    sortBy: 'date',
    reverse: true
  }
}))
.use(feed({
  collection: 'posts',
  destination: 'feed.xml'
}))

It should be obvious what's going on there: we define a collection of blog posts and the feed gets a reference to it. The same goes for a template, which could be an archive page:

<article>
    <ul>
        {{#each collections.posts}}
            <li>
                <h3>{{this.title}}</h3>
                <article>{{this.contents}}</article>
            </li>
        {{/each}}
    </ul>
</article>

What happens if you attach the collection as metadata to the event?

So let's say I change the date of a blog post and save. The save event trickles down the sighjs pipeline. The relevant blog post page gets updated because there is an easy reference from the event. What about the archive page though? The archive page is only connected to the collection.

insidewhy commented 9 years ago

How about:

pipelines.blog = [
  glob('*.md'),
  collections(...),
  merge(
    feed('posts'),
    archive('posts')
  ),
  write('build')
]

So collections only passes special metadata events down the stream; subsequent plugins can then turn that metadata into Event objects representing files to be written. In this case the merge forwards the same metadata to two plugins, then merges the output of those plugins back together to send down the stream.
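
A hypothetical feed plugin along those lines - the Event import path and constructor fields are guesses, and renderFeed is made up:

var Event = require('sigh').Event // assumed export

module.exports = function(op, collectionName) {
  return op.stream.map(function(events) {
    var meta = events.find(function(e) { return e.collections })
    var posts = meta ? meta.collections[collectionName] : []
    return [ new Event({
      basePath: '.',
      path: 'feed.xml',
      type: 'add',
      data: renderFeed(posts) // made-up XML renderer
    }) ]
  })
}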

insidewhy commented 9 years ago

Or instead of archive, maybe it should be something like collectionTemplate(...), which could take both normal Event objects and metadata objects; it would then apply the metadata to the Event objects via the template system.

insidewhy commented 9 years ago

Like um:

pipelines.blog = [
  glob('*.md'),
  collections(...),
  merge(
    feed('posts'),
    [ glob('templates/*.html'), collectionTemplates() ]
  ),
  write('build')
]

This relies on the fact that glob plugins also forward their inputs down the stream in addition to creating events based on the supplied pattern(s).

tcurdt commented 9 years ago

The feed is generated solely based on the special events generated by the collections plugin. The archive would really be a template plugin that should react to several events: mainly the template file and then the collection (plus there might also be references inside the template to other template files/partials).

You were faster than me :) but yeah, the last one looks about right.

tcurdt commented 9 years ago

Probably a little more like this, though:

pipelines.blog = [
  glob('*.md'),
  collections({
    posts: '*.md'
  }),
  merge(
    feed('posts'),
    [
      glob('templates/*.swig'),
      template('post.swig', {
        collection: resolve_collection('posts'),
        something: 'bla'
      })
    ]
  ),
  write('build')
]

Not sure how to make the reference to the collection clear. Passing it explicitly as a parameter is ugly too:

template('post.swig', [
    'posts' // collections
  ], {
    something: 'bla' // other context values
  })

insidewhy commented 9 years ago

You pass in post.swig as, like, the "master" template name? Then the other events, I guess, would be used for references post.swig makes to other templates?

pipelines.blog = [
  glob('*.md'),
  collections({
    posts: '*.md'
  }),
  merge(
    feed('posts'),
    [
      glob('templates/*.swig'),
      template('post.swig', {
        collection: 'posts',
        context: { ... }
      })
    ]
  ),
  write('build')
]

insidewhy commented 9 years ago

Or name them all:

      template({
        root: 'post.swig',
        collection: 'posts',
        context: { ... }
      })

You could also use basePath in your template glob, depending on how you want to structure your built directory tree.

tcurdt commented 9 years ago

Exactly. All posts should use the post.swig template. If we had the full dependency tree, the glob for *.swig would not be necessary. Since we don't have that information - write on all changes.

tcurdt commented 9 years ago

root feels wrong - although I understand where you are coming from. It's really just the template (which might or might not include other templates/partials).

Given that the context and the collection would need to be merged into a single context anyway (as that is the common interface for template engines), it feels quite ugly to use the explicit naming approach.

This is what it should look like for the template:

{
  collection: {
    posts: [
      { title: "", date: "", excerpt: "", ... },
      { title: "", date: "", excerpt: "", ... }
    ]
  },
  something: 'bla'
}

tcurdt commented 9 years ago

btw: I am working on a comparison project over here:

https://github.com/tcurdt/site-boilerplate

Still debugging the gulp stuff and I still have to finish up the metalsmith setup. If you'd work out the sighjs setup, that would be fantastic. It should at least give a baseline for the full-stack example.

insidewhy commented 9 years ago

Oh I've been working on this all weekend actually, I meant to give you a status update! Here's an example full-stack sigh file:

var merge, env, pipeline, debounce, select, reject
var glob, concat, write, babel, uglify, process, sass, browserSync, mocha

module.exports = function(pipelines) {
  pipelines.alias.build = [ 'client-js', 'css', 'html', 'server-js' ]

  // client side:
  pipelines['client-js'] = [
    glob({ basePath: 'client' }, '*.js'),
    babel({ modules: 'system' }),
    env(
      // TODO: use sigh-jspm-bundle instead
      [ concat('app.js'), uglify() ],
      'production'
    ),
    write({ clobber: '!(jspm_packages|config.js)' }, 'build/client')
  ]

  pipelines.css = [
    glob({ basePath: 'client' }, '*.scss'),
    sass(),
    write('build/client')
  ]

  pipelines.html = [
    pipeline({ activate: true }, 'client-js', 'css'),
    glob({ basePath: 'client' }, '*.html'),
    // in development mode also inject the browser-sync enabling fragment
    env(
      glob('lib/browser-sync.js'),
      'development'
    ),
    debounce(600),
    // TODO: inject css paths and browser-sync fragment into html here using `sigh-injector`
    select({ fileType: 'html' }),
    write('build/client')
  ]

  pipelines['browser-sync'] = [
    pipeline('html', 'css', 'client-js'),
    browserSync({ notify: false })
  ]

  // server side:
  pipelines['server-js'] = [
    glob({ basePath: 'server' }, '*.js'),
    babel({ modules: 'common' }),
    write({ clobber: true }, 'build/server')
  ]

  pipelines['server-test'] = [
    pipeline('server-js'),
    pipeline({ activate: true }, 'mocha'),
  ]

  pipelines.explicit.mocha = [ mocha({ files: 'build/**/*.spec.js' }) ]

  pipelines['server-run'] = [
    pipeline('server-js'),
    reject({ projectPath: /\.spec\.js$/ }),
    process('node build/server/app.js')
  ]
}

The idea is to use HTTP/2 for development; that way you can split up and cache individual files without the per-request performance overhead of HTTP/1.

I just wanted to add three more things before turning it into a yeoman generator:

insidewhy commented 9 years ago

BTW, maybe you're aware, but your gulp example there would be horribly inefficient as it rebuilds everything on every change. You'd need to use gulp-cached, gulp-remember, and probably also gulp-order to fix it. This is one of the things that really cheesed me off about gulp, actually.
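
Roughly what that fix looks like on the gulp side (task and file names are illustrative):

var gulp = require('gulp')
var cached = require('gulp-cached')
var remember = require('gulp-remember')
var babel = require('gulp-babel')
var concat = require('gulp-concat')

gulp.task('scripts', function() {
  return gulp.src('src/**/*.js')
    .pipe(cached('scripts'))   // only changed files pass through
    .pipe(babel())             // per-file work happens on the survivors
    .pipe(remember('scripts')) // re-insert the unchanged files
    .pipe(concat('app.js'))    // so the concatenated output stays complete
    .pipe(gulp.dest('build'))
})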

insidewhy commented 9 years ago

It's also much more common to alias require('gulp') to the variable gulp rather than Gulp.

tcurdt commented 9 years ago

@ohjames I'm aware of the horrible re-building behaviour. For now I was just trying to focus on getting the collection thing working. But right now splitting and then merging streams seems to be a major problem: merge2 and combined-stream2 are not working as documented. The more I work with gulp, the more I want to run away screaming :)

Thanks a lot for the better example! For my needs it's still lacking templating and collections though. Happy to work on it myself - but I might need a little guidance.

insidewhy commented 9 years ago

Merging streams doesn't really work very well in gulp; I raised an issue against gulp showing that source maps get corrupted when streams are merged. The gulp author was really rude though, so I abandoned the bug and have no idea if it's been fixed.

tcurdt commented 9 years ago

At least it's not me then. I don't know if it's the same thing, but what I am seeing is that when I split the stream, subsequent transformations affect both streams (although they should be separate - according to the gulp folks). Then merging them back is yet another problem.

sighjs feels so much better - but especially the collection handling (as we discussed before) seems tricky.

For gulp I had to fork most of the plugins anyway as they didn't really work as I needed them to. So having to write new ones for sighjs is no longer really a con when comparing the two.

But the collection thing is really the most crucial part for implementing the boilerplate.

tcurdt commented 9 years ago

Hm. I don't understand how to rename/move files yet. There is changeFileSuffix(targetSuffix) but that's just not enough.

insidewhy commented 9 years ago

sigh, like gulp, passes data in memory. You can modify the events, or create new events that get passed down the pipeline; the write plugin then writes the events it receives to the fs according to the projectPath field of each event. You can also use select/reject to filter events before they reach the write, as shown in the example.

insidewhy commented 9 years ago

glob --(turns files in the fs into Events)--> transform --(modifies or creates new events)--> write (writes events to the fs)

tcurdt commented 9 years ago

I see, so changing the projectPath (that attribute name wasn't obvious) https://github.com/sighjs/sigh/blob/master/src/Event.js#L48 should do it.

insidewhy commented 9 years ago

Yeah, the path is made up of basePath + projectPath. You used to only be able to set path and the other two were read-only properties; since 0.12.20 you can set any of them and the others will update.
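
So a rename/move boils down to a tiny plugin along these lines (same assumed plugin shape as the earlier sketches):

module.exports = function(op) {
  return op.stream.map(function(events) {
    events.forEach(function(event) {
      // rewriting projectPath moves the file relative to its basePath
      // before write() flushes it to disk
      event.projectPath = event.projectPath.replace(/\.md$/, '.html')
    })
    return events
  })
}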