Open billrawlinson opened 9 years ago
Is this with concat or gulp itself?
It seems like it is concat to me, considering the munged characters are interleaved (every other file). I figured I'd post the problem here first to see if you guys could reproduce it and, possibly, confirm or rule out gulp-concat as the cause.
@billrawlinson Can you try just piping src to dest a bunch of times and see if that causes the issue as well?
Sure I'll try Monday. I'm on the road now. If anyone else wants to know sooner they can pull the demo project and try.
I figure the problem is either in the file read or in concat, as the problem manifests in the middle of the concatenated result, which should rule out the write operation.
So I ran the tests where I just pipe in the files to dest and nothing funky happens to the files in the process.
I've updated the test demo project to where it does both.
If you want to run the tests to see the results, just pull the project and give it a run. Each test now puts its results in a folder titled "results#", where # is the number of the test being run.
I'm guessing it has something to do with buffer conversions in concat-with-sourcemaps:
- https://github.com/floridoo/concat-with-sourcemaps/blob/master/index.js#L109
- https://github.com/floridoo/concat-with-sourcemaps/blob/master/index.js#L43-L46
- https://github.com/floridoo/concat-with-sourcemaps/blob/master/index.js#L15-L18
Probably mixing a bunch of encodings together using node's Buffer module is causing unexpected results.
In test examples 2 (utf16le) and 3 (utf16be) the encodings are all the same. Tests 1 and 4, with mixed encodings, end up with better results (though still broken). Test 5, utf8, is the only one that has the correct results.
@billrawlinson I mean that the separator is treated as UTF-8, so combining that with some UTF-16 buffers might be yielding weird results
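To illustrate the point above, here is a minimal sketch using only node's Buffer module (no gulp involved). The sample strings and the `join` helper are made up for illustration and are not taken from gulp-concat or concat-with-sourcemaps. It assumes the separator is a single UTF-8 newline byte placed between UTF-16LE buffers, which shifts the two-byte alignment of every other file.

```javascript
// Three "files" as raw UTF-16LE bytes, roughly what a stream of file buffers would carry.
const fileA = Buffer.from('abc', 'utf16le');
const fileB = Buffer.from('def', 'utf16le');
const fileC = Buffer.from('ghi', 'utf16le');

// The same newline character in two encodings: 1 byte vs. 2 bytes.
const utf8Newline = Buffer.from('\n', 'utf8');       // <0a>
const utf16Newline = Buffer.from('\n', 'utf16le');   // <0a 00>

// Hypothetical helper: join buffers with a separator buffer between them.
function join(buffers, separator) {
  const pieces = [];
  buffers.forEach((buffer, index) => {
    if (index > 0) pieces.push(separator);
    pieces.push(buffer);
  });
  return Buffer.concat(pieces);
}

// A 1-byte separator knocks the next file off its 2-byte boundary;
// the following separator happens to shift it back again.
const broken = join([fileA, fileB, fileC], utf8Newline).toString('utf16le');
const intact = join([fileA, fileB, fileC], utf16Newline).toString('utf16le');

console.log(JSON.stringify(broken)); // first and third file readable, second munged
console.log(JSON.stringify(intact)); // "abc\ndef\nghi"
```

If this sketch matches what happens inside concat, it would explain why the damage is interleaved: each odd-length separator toggles the alignment.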
ah, that makes perfect sense.
I assume, due to the nature of gulp pipes that concat has no way of knowing the encoding of the various buffers coming in to it from src?
You are correct; it is the separator character that is causing the problem. I set up the test as follows:
function runConcatTest(d) {
    // Concatenate with an empty separator instead of the default newline
    var testResults = gulp.src(d.sources)
        .pipe(concat(d.outfile, { newLine: '' }))
        .pipe(gulp.dest(d.outpath));
    testResults.on('data', printToConsole);
}
Here I basically blanked out the newLine character, and tests 2 and 3 both work perfectly while tests 1 and 4 are still all mucked up. If I don't override the newline, it is broken as before.
Maybe as a temporary solution the readme could just be updated to let people know that if they are joining UTF16 files they should put their own newline at the end of each file and then override the join character to be nothing.
UPDATE: I updated the demo project to show the working scenario with test 2 using an empty string as the separator.
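The workaround can be sketched at the Buffer level, again without gulp. The file contents here are invented for illustration; the assumption is that each UTF-16LE file already ends with its own UTF-16LE newline, so an empty separator adds zero bytes and the two-byte alignment is never disturbed.

```javascript
// Hypothetical file contents; each file carries its own trailing newline,
// encoded in the same encoding as the rest of the file.
const sources = ['first', 'second', 'third'];
const files = sources.map((text) => Buffer.from(text + '\n', 'utf16le'));

// An empty separator is zero bytes long regardless of encoding.
const emptySeparator = Buffer.from('', 'utf8');

const pieces = [];
files.forEach((file, index) => {
  if (index > 0) pieces.push(emptySeparator);
  pieces.push(file);
});

const joined = Buffer.concat(pieces);
const decoded = joined.toString('utf16le');

console.log(JSON.stringify(decoded)); // "first\nsecond\nthird\n"
```

This only helps when all inputs share one encoding, which is consistent with tests 2 and 3 working while the mixed-encoding tests stay broken.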
@billrawlinson Hmm trying to think up a solution here, going to dig into the buffer docs and see if I can figure something out
https://nodejs.org/api/buffer.html#buffer_class_method_buffer_isencoding_encoding
Could emit a warning if the user mixes encodings (assuming we can't figure out a way to make it work).
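One possible shape for such a warning, as a rough sketch only: note that `Buffer.isEncoding` just checks whether a string names a supported encoding, it does not inspect a buffer's contents, so the sketch guesses from the byte order mark instead. `sniffBom` and `mixedEncodingWarning` are hypothetical helpers, not part of gulp-concat. Files without a BOM are reported as unknown, and a UTF-32LE BOM would be misread as UTF-16LE.

```javascript
// Hypothetical helper: guess an encoding from a leading byte order mark.
function sniffBom(buffer) {
  if (buffer.length >= 3 && buffer[0] === 0xef && buffer[1] === 0xbb && buffer[2] === 0xbf) {
    return 'utf8';
  }
  if (buffer.length >= 2 && buffer[0] === 0xff && buffer[1] === 0xfe) {
    return 'utf16le';
  }
  if (buffer.length >= 2 && buffer[0] === 0xfe && buffer[1] === 0xff) {
    return 'utf16be';
  }
  return null; // no BOM: the encoding cannot be told from this check alone
}

// Hypothetical helper: return a warning string when the inputs disagree, else null.
function mixedEncodingWarning(buffers) {
  const seen = new Set(buffers.map((buffer) => sniffBom(buffer) || 'unknown'));
  if (seen.size <= 1) return null;
  return 'Mixed encodings detected: ' + Array.from(seen).join(', ');
}

// Usage with hand-built sample buffers (BOM followed by the letter "a").
const littleEndian = Buffer.from([0xff, 0xfe, 0x61, 0x00]);
const bigEndian = Buffer.from([0xfe, 0xff, 0x00, 0x61]);
console.log(mixedEncodingWarning([littleEndian, bigEndian]));
console.log(mixedEncodingWarning([littleEndian, littleEndian]));
```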
I played around with this for a bit and it stumped me. @billrawlinson, did you figure anything out?
I did not. I just resorted to not using UTF 16 :-1:
Have run into the same issue and it turns out the files that end up munged are UTF16.
When concatenating files which are UTF 16 Little Endian (unicode) every other file gets munged a bit.
When concatenating files which are UTF 16 Big Endian the same result happens.
If you alternate files where the first is UTF16LE and the second is UTF16BE then just the very end of the second file gets munged.
I have set up a demo project that illustrates this and has a bunch of notes that explain why I even tried these things. I don't know for certain the problem is in gulp-concat (it could be in gulp itself, in gulp.src()). https://github.com/finalcut/gulp-concat-bug