Stylus Sourcemaps - Githubissues

jescalan commented 10 years ago

This should be integrated into the stylus API: http://learnboost.github.io/stylus/docs/sourcemaps.html

notslang commented 10 years ago

Actually, we need a way to do this with CoffeeScript too. Any idea how we should extend the interface to support this? Changing the return value of the promise would be really lame to just include the optional source map (promises can't return multiple values)... so I'm not totally sure how we should deal with this. When we ran into this same issue in #37 we solved it by using an event emitter to deal with logging as a separate thing, but source maps aren't as decoupled as logging is... they end up being written to the disk just like the text resulting from the compilation operation.

jescalan commented 10 years ago

Yeah definitely valid -- I mean we can use the progress event to emit multiple values but it's a little hacky. This is sort of why I'm starting to think that using FRP might actually be a better interface than promises in a lot of places. The issue there is just that it's not familiar to most developers, unfortunately.

What do you think though? Just throwing a few ideas out there, we could use progress to return it, we could massively breaking change everything and return an object rather than a string from compiles, we could cache out the source maps then have another method call return that (or undefined) afterwards...

stephenlacy commented 10 years ago

:+1:

Returning an object does sound nice. {file: 'data', sourcemap: 'srcmap'}

Though promises are useful as well...

jescalan commented 10 years ago

Well, it would be a promise that returns an object. But I think the major problem with returning an object is that is it will 100% break everything else using accord right now, and that it will also be an unused extra step for the vast majority of compilers :confused:

stephenlacy commented 10 years ago

Quite right...

notslang commented 10 years ago

There's another solution that we could use, but it's almost as hacky, if not more hacky than, the progress events thing:

We could make another set of functions (compileSourceMap, compileFileSourceMap, renderSourceMap, renderFileSourceMap) that would accept the same args as the compile/render function that they're mirroring, but would return only the source map as a string. We would need to always compile/render with the source map (if the adapter supports it), and then cache that result in case the *SourceMap function gets called... so there's some complexity added there. Also, if the *SourceMap function got called before the regular render/compile function, then the *SourceMap function would need to call the compile/render function first, cache the result of the compile/render (in case the compile/render function is called afterwards), and then return the source-map.

So long as we're adding caching anyways, the functionality shouldn't be a huge problem. But the number of functions that accord would be exporting would be getting a little crazy. We would still be able to maintain the "one function = one result" idea, and we wouldn't break anything... it would just make it a bit odd.

Oh, and nice FRP article, @jenius ... I'm not sure how we could use that without breaking everything, but it looks like a good read.

jescalan commented 10 years ago

On FRP, we certainly couldn't, just pondering it for future projects.

For the cacheing thing, that's sort of along the lines of what I was thinking with the "cache out the source maps then have another method call return that" from the earlier comment. I feel like perhaps we could make a slightly smoother interface, like maybe just one sourcemap function that would return undefined or the sourcemap for the last file that was compiled, but yeah that's pretty ugly as well tbh. I feel like this direction is the best, just not sure what the smoothest implementation would be...

notslang commented 10 years ago

Yeah, that's similar, but the reason why I'd like to avoid that is because it makes the accord adapter highly stateful. The caching thing would make it stateful too, but in a way that is only reflected in performance. Having a sourcemap function that returns whatever the last sourcemap was, means you need to keep track of the last call to accord... Not only does that present a problem in regular programs where you need to make sure those calls happen one after the other, but in modular programs where accord could be required in multiple (unrelated) parts of the program, you would need to put a "lock" on calls to accord to prevent another call from going through after the initial compilation and changing the state, causing the wrong sourcemap to be returned.

antoniobrandao commented 10 years ago

Since gulp-stylus uses Accord, will solving this issue make gulp-stylus sourcemaps work?

jescalan commented 10 years ago

Yeah @slang800 that is definitely true :confused: I mean maybe we could just return it via progress? It's definitely not ideal but it would work and not break stuff...

@antoniobrandao yes it would

notslang commented 10 years ago

I think that using the compileSourceMap, compileFileSourceMap, renderSourceMap, renderFileSourceMap functions would be slightly less hacky than the progress events thing... it requires a bit of extra logic to make it stateless, but it keeps the interface clean, and would actually be really easy to work with in accord-watcher

jescalan commented 10 years ago

So these functions would exist and just not work for everything except for stylus and coffeescript though?

notslang commented 10 years ago

the functions would only exist when there is source map support... just like we do for the clientHelpers and compileClient functions

jescalan commented 10 years ago

Hmm ok -- anyone else have thoughts on this? @joshrowley @kylemac

hhsnopek commented 10 years ago

I like the use of objects for all the data, but it seems a little much for adapters that don't support source maps, but if we don't want a breaking change adding the functions would be next best route.

joshrowley commented 10 years ago

Returning an object sounds like a good idea to me as well, but I understand we don't want to have such a large breaking change. Something to consider for the next major release of accord, because I can see this being an issue down the road for some other new compiler feature, and an object would make it easier to support it without breaking changes.

Out of the two options discussed, I don't like either of them that much. My worry for the extra set of functions is having a lot of bugs dealing with returning the correct cached sourcemap for the correct compiler call. I could see problems arising if you're asynchronously calling it a bunch, which may be easier to handle with the event emitter based interface.

Either way though, I think @jenius and @slang800 have a better sense of the challenges in implementing either of these, so I trust you two to decide.

jescalan commented 10 years ago

I think what @slang800 is proposing is making duplicate methods for renderSourceMap etc. that return an object with the source map but not breaking the normal render methods... it's not the prettiest thing ever but perhaps it could serve as a temporary patch while we transition into returning objects?

joshrowley commented 10 years ago

Ah ok, so the object returned by renderSourceMap would contain both the compiled source and the sourcemap. renderSourceMap would eventually become the regular render function if we ever change the return value of render from a string to an object. Yeah, I like that.

notslang commented 10 years ago

Nope, the renderSourceMap would only return a string (the source map). What I'm proposing (in comparison to the similar method that @jenius proposed) is making those functions stateless, so you don't need to compile a file directly before you get the source map.

jescalan commented 10 years ago

The issue like @joshrowley said there is what happens when you are quickly compiling many files asynchronously? I wouldn't really want to have any opportunity for an incorrect result to be returned...

notslang commented 10 years ago

That was just an issue with the (accidentally shared) options object being mutated by one operation and then used by a different operation, right? Because if that's all it was then we don't have to worry because the cache would fully describe the job that the cached item was compiled/rendered for... so it wouldn't matter what order retrieval is done in.

The worst that could happen is one job checking for a cached item and finding that it isn't there, another job checking for a cached item and finding that it isn't there, and then both of them compiling the same thing (possibly by different means, if one is rendering regularly and the other is rendering for the source map) and storing it in the cache - resulting in a duplication of work. But that performance issue could only happen if querying the cache and updating the cache with the result of the job is not one synchronous operation, and even then it seems pretty unlikely.

...but good point @jenius, that's something worth adding tests for.

jescalan commented 10 years ago

Fair enough, as long as we can ensure the cache matches up I'm cool with this. Perhaps something like generateSourceMap would even be a more appropriate name here as well?

notslang commented 10 years ago

I like the generateSourceMap name better, but I think that we need compileSourceMap, compileFileSourceMap, renderSourceMap, and renderFileSourceMap because any of the functions except clientHelpers could have a source-map associated with them, right?

Actually, I think we'd need compileClientSourceMap, compileFileClientSourceMap too.

Also, some other problems w/ this interface:

The compile, compileFile, compileClient, and compileFileClient all return functions, but source maps are made to map strings to strings (not strings to functions). Should we make an assumption about the way that the function would be serialized and make the source map work with what Function.toString() would output, or some other standard serialization format? Or should we make all those functions return strings (which, while minor, is still a breaking change).
Source map files are just JSON, should the source map function(s) return a string of JSON, or the object itself. I don't really expect the object to be manipulated before being written to a file in most cases, but our other methods return deserialization formats like functions (rather than stringified functions), so it might be worth going with that convention. Also, some type of manipulation could become common, like minifying source maps, or passing them on to some other tool, so maybe returning an object makes more sense.
If a string already has a sourcemap associated with it, should we make a way of passing that to the function so the new operation augments the existing sourcemap (allowing mapping to go through multiple operations). Or should each function return its own sourcemap, and then the user needs to merge them together using some other function.

jescalan commented 10 years ago

Yeah the string issue is really difficult. Makes me think that maybe the progress hack is actually the way to go for now. And I agree with returning the actual object

jescalan commented 10 years ago

Wait can you even get a sourcemap back for a string? I guess you could... but would it really be that helpful?

notslang commented 10 years ago

what do you mean by "get a sourcemap back for a string"?

OverZealous commented 10 years ago

Any progress on this? I was trying to create a PR for gulp-stylus to support proper gulp-sourcemaps, but I effectively have to remove accord to get access to the renderer.

notslang commented 10 years ago

Yeah, we've got the interface almost figured out (besides those 3 issues in my 2nd-to-last) comment. I haven't had much time to work on this lately, but now that my current project is drawing to a close, I think I'll be able to tackle this issue next.

Oh, and feel free to join the discussion - it would be great to get some additional opinions on the new interface.

jescalan commented 10 years ago

Ok so here's the plan: We're going to push a breaking change and have it return objects. It's sad that it has to be breaking, but I think it's for the better in the long run. We'll just return an object that always has a compiled property that has the compiled content. I'd like to be able to complete and release this before the end of next week.

While we're making breaking changes as well, what are you guys' thoughts on returning streams by default rather than strings? For large amounts of content, this could be quicker and easier on the memory. Especially once we're also adding in sourcemaps or any other things that might come down the line, it could end up being a very large object. For those who don't want to deal with streams, we could have a buffer: true option that would buffer the streams into strings before returning.

notslang commented 10 years ago

Why do we want to do the object returning method?

That reduces composability because a function would be taking a string & outputting a nested object. So connecting another function would require a middle function to transform the output of the first into 2 strings. Plus, it's a breaking change.

Streams are a good idea... not that many (or any?) compilers actually support them, but it would be nice for the API. Also, this should probably be a separate issue.

Oh, also - those 3 questions I raised still stand for the "returning objects" way of doing things - we should decide on those no matter what method we're doing.

jescalan commented 10 years ago

Because it's the most reasonable and future-proof way to do things. It doesn't require any questionable and highly time consuming synchronization between other functions or implementations of caches etc. It's simple to understand, simple to use, simple to implement, and solves the problem entirely. For the composability, it's not something I'm super worried about. The alternative is not having sourcemaps which apparently a lot of people value, so if that means writing one line of code where you access the string from the object, such is life. Yes, it's breaking, but that's why we have semver. We'll slowly switch things over as needed.

For streams I actually want to say this can be supported later. Strings can be the default, if more compilers start adding stream support and/or we start seeing memory issues from large string/sourcemap pairs we can add an option to return as a stream to the options.

For your 3 questions:

I don't care about fuction.toString(). You would get back a promise for an object which contains at least one string. I don't see any issues with this at all.
Returning an object for the sourcemap sounds fine to me. You can always stringify it if you want.
I can't imagine a situation where you are compiling from a language and it already somehow has a sourcemap associated with it. If you haven't compiled yet, you don't have a sourcemap. If you are compiling from 2 different languages, it's on you to figure out how nested sourcemaps are going to work out. That's too much of an edge case for us to care about.

Will start working on this today, hopefully we can start testing this shortly!

notslang commented 10 years ago

In regards to the 3rd question & composability: Composability is incredibly important in a library like this because what you're providing is transformations of strings. There should be no reason why we can't chain multiple transformations together. This isn't even a close to an edge case: you compose 2 functions every time you do a CoffeeScript -> JS -> minified JS compilation. And, if you can't pass a source map through this process, you aren't able to map to production code. If we want to go with the method of returning an object, we should make all functions accept an object (the same type of object that the function returns).

In regards to the 1st question: I'm not sure if I explained this well enough. The issue is source maps map a starting string to an ending string. The problem is we return a function. A function could be serialized into a string in different ways, so there's not really a 1:1 map between functions and strings (unless we start making assumptions). So, we cannot have a sourcemap for a string -> function transformation. Thus, we either need to make an assumption about what the "true serialization" of that function is, or we need to return a string.

jescalan commented 10 years ago

Ok so for coffee -> js -> minify, you would need an inline sourcemap in order to have that work, correct? I guess inlining sourcemaps should be an option as well. But I'm ok with having functions accept an object as well if you think that would help.

I still don't understand what you're saying in the second paragraph, I'm sorry.

notslang commented 10 years ago

We shouldn't need to do an inline source map... I think that inlining should be an optional transformation that you'd do as a last step. That way we don't need to parse out, decode, & re-encode the sourcemap during every transformation. Also, I think most people want their sourcemap as a separate file.

Anyway, you'd just need to do something like this in order to pass a sourcemap through the transformation process:

transform = (inputString, inputSourceMap) ->
  outputString, outputSourceMap = compile(inputString)
  if inputSourceMap?
    outputSourceMap = mergeSourceMaps(outputSourceMap, inputSourceMap)
  return outputString, outputSourceMap

(where compile is the regular compiler that takes a string and returns a string + sourcemap)

For the 2nd part, which part did I lose you at?

jescalan commented 10 years ago

I guess my question was more along the lines of do most compilers actually accept another sourcemap separately and handle it correctly, but I guess that's not up to us. What are your thoughts on having the ability to pass in either a string or object -- if it's an object and it has a sourcemap and the compiler can handle sourcemaps as input, it will be passed in?

On the second part, I don't really get any part of it. Serializing functions is not something you can or should ever do, so I don't really think it's worth it for us to cater to this?

notslang commented 10 years ago

No, we wouldn't assume that any compilers accept a sourcemap - we'd be getting a sourcemap from the compiler, and then if we had a sourcemap passed in, we would merge the 2 source maps to get one sourcemap that represents all transformations that have been applied to the string (so far).

You need to serialize functions before you can write them to a file or send them to a web-browser (because you can't send abstract objects over the network - only strings that represent those objects)... It's something we do right now. Maybe "stringify" is a better word?

jescalan commented 10 years ago

Ok gotchya, that's fine. Are there libraries that will do this merge for us already?
Do you mean serializing function that have been made specifically to be able to be serialized for the client side? That's the only time you can actually serialize a function, otherwise you should never rely on it.

notslang commented 10 years ago

probs... I know that they did it for the clojurescript -> minified js compiler and it turned out beautiful
yes - like the ones that we return right now from compile, compileFile, compileClient, and compileFileClient

jescalan commented 10 years ago

Could you find one potentially?
Ok, perhaps we should say that you can't get a sourcemap back from those methods for now?

notslang commented 10 years ago

looks like this does what we want: https://www.npmjs.org/package/multi-stage-sourcemap
that would be really bad. I think we should switch to having those methods return a string

jescalan commented 10 years ago

Isn't that just eliminating those methods? We already have other methods that return a string...

notslang commented 10 years ago

no... compiling and rendering are very different operations

jescalan commented 10 years ago

So you are saying it should stringify the compiled function then?

jescalan commented 10 years ago

If you are, we can't. The functions that come out of compile cannot be stringified, that's what compileClient provides.

notslang commented 10 years ago

It what way can they not be stringified? Cause we have examples in the readme that show every single one of those functions being stringified...

jescalan commented 10 years ago

No, they can't. Compile client has stringifiable functions, compile does not. That's why the two are different. If they could both be stringified, they would not both need to exist. You can try compiling jade then stringifying it and see if it works out the other end. Spoiler, it doesn't -- but feel free to give it a shot anyway.

notslang commented 10 years ago

Ah, that makes sense: compileClient returns a string while the other compile functions return a function. We should change that in the readme, because right now we're showing all of the compile functions having toString called on their output... which wouldn't make sense for a string, and shouldn't be done for those functions.

jescalan commented 10 years ago

Oh yeah I didn't realize that was happening in the readme. Anyways I started work on implementing this change, it's in the 0.13.0 branch. Will try to finish up this week and push a new version.

notslang commented 10 years ago

Oh, btw: there's one other way we could do that interface for the source maps while maintaining some of the elegance of the old API. We could return an object that has a toString function, which returns whatever's in the result property. That way, if you do @jade.compile('p blah').done((res)-> console.log res) it would print out <p>blah</p> rather than an object.

It would still be a breaking change and would still require a major version bump, but it would be slightly "less breaking" and it would be a little easier to upgrade to the next version of accord. Also, I think that having the object be represented as the contents of the file is accurate because the resulting string really is what the job produced... the other stuff like sourcemaps is just metadata about that string.

notslang commented 10 years ago

Actually, now that I think about it, naming the property "result" doesn't make much sense either... since the return value of a compilation/render could be passed to another render function, that "result" of the first function would become the "input" of the 2nd function. I think I'm just going to call it "text"

jescalan / accord

Stylus Sourcemaps #52