gulpjs / vinyl

Virtual file format.
MIT License
1.28k stars 107 forks source link

fix remote-path normalization (#127) #128

Closed bezoerb closed 7 years ago

yocontra commented 7 years ago

Didn't look yet, but we should have tests for:

Will review later today

phated commented 7 years ago

Definitely needs all of the tests stated by @contra

I'd also like to wait on feedback from @darsain @jonschlinkert and @doowb

jonschlinkert commented 7 years ago

Hmm, since path handling in vinyl is completely geared towards file system paths, maybe this is an opportunity to rethink the URL handling so that it's more idiomatic for URLs.

For example:

Maybe we should add support for file.url as an alternative to file.cwd? I'm not sure what the best approach is, but my gut reaction is that either a) URL normalization doesn't really belong here, so maybe we should create an adapter like vinyl-url or something (although this isn't really about adapters, it's about the actual virtual object being created, so even if there was an adapter for handling requests, it still wouldn't address how the URL segments should map over to file path segments) or b) if it does belong here, we should do it the right way and add support for all of the URL segments that are created by url.parse() (@contra is touching on this I think). Just my 2c.

edit: idea... we could just create conventions for how the URL segments created by url.parse() map over to the existing path segments. Then, if a user defines {url: true} (and/or we can use the regex check to determine this) when creating a new file, we can parse it as a URL into those segments. So the result would be a vinyl file object that has the same path segments and conventions as existing vinyl files, but using a URL and normalization logic that is idiomatic for URLs.

edit edit: regardless of the approach, it seems like the code in the pr doesn't make sense for this lib. If we're going to go to these measures for normalization, maybe it's better to allow a normalize function to be passed so that users can deal with edge cases. The normalize-url module does some questionable things on urls.

darsain commented 7 years ago

This feels like an entrance to a very deep and complicated rabbit hole that will lead to a lot of edge cases and weird behavior as we try to unify 2 different resource pointing concepts into one ambiguous monster.

For start, how would we handle search and hash parts of the url? or ports, or username:password pairs, or tons and tons of other URI stuff... the RFC 3987 is a very nasty beast to tame: http://www.faqs.org/rfcs/rfc3987.html

I think it would be simpler to create a separate "virtual online resource format" project, differentiate between them with isVinyl/isUril (example name) or something like that, and have a separate code paths for each, since that will be necessary anyways as you need different APIs and tools to deal with files and network requests.

yocontra commented 7 years ago

It worked fine prior to path normalization being added - vinyl-http, vinyl-ftp, etc. are all things that exist (now broken due to normalization addition)

Worst case we can revert the normalization and do it a layer up

jonschlinkert commented 7 years ago

This feels like an entrance to a very deep and complicated rabbit hole that will lead to a lot of edge cases and weird behavior as we try to unify 2 different resource pointing concepts into one ambiguous monster.

yeah, this sums up what I was thinking as well.

phated commented 7 years ago

@contra it was a major bump because it changed path semantics

bezoerb commented 7 years ago

it was a major bump because it changed path semantics

That's why i asked if the dropped support for remote paths was the intended behaviour and vinyl should focus on file system paths. What was the the goal in introducing normalization? What's the advantage in making this lib more strictly focused on filesystems?

edit edit: regardless of the approach, it seems like the code in the pr doesn't make sense for this lib. If we're going to go to these measures for normalization, maybe it's better to allow a normalize function to be passed so that users can deal with edge cases.

I personally like the idea. This would address all concerns regarding this pr and let's the user opt-in to the opinionated normalization process which in my case would be totally fine to fix my issue.

Maybe i am in the wrong position to make decisions here but i would be fine to propose another PR for the normalization overloading ;)

yocontra commented 7 years ago

If the user is passing in a normalize function then why wouldn't they normalize the path before passing it in? Not a fan of that. I don't see why it would be a huge deal to mandate people normalize before passing it in (previous behavior, better docs) and leave it out of the base level library. What was the normalization fixing? IIRC every adapter does it's own normalization.

phated commented 7 years ago

I completely disagree. However, I've actually thought of an elegant way to handle both. I'll write up my thoughts in the morning.

phated commented 7 years ago

I actually believe this problem can and should be solved by my idea of an Enhanced Stat Object (#105). Usage would look something like:

var file = new Vinyl({
  path: 'http://google.com',
  stat: {
    isRemote: true
  }
})

The isRemote flag in the Enhanced Stat Object would indicate that a Vinyl resource is being constructed with a remote path and would disable normalization (or do URL normalization). Also, Enhanced Stat Objects would allow for property changes and track history, a non-remote file could be converted to a remote file and users would know where it happened in a pipeline.

darsain commented 7 years ago

@phated yup, that'd be a pretty good solution

bezoerb commented 7 years ago

Closing this PR because it won't make it into master. Thanks for the discussion/feedback. In my opinion the enhanced stat object solution entails more maintenance work as there will always be edge-cases which need to be fixed and it takes away the control from the user.

I would go for a) "drop the normalization + enhance documentation" or b) something like https://github.com/gulpjs/vinyl/pull/130.

But no matter what option you'll choose. I'm sure it will work out ;)