chakra-core / ChakraCore

ChakraCore is an open source Javascript engine with a C API.
MIT License
9.12k stars 1.2k forks source link

Questions about JsBulitIn #6410

Open zenparsing opened 4 years ago

zenparsing commented 4 years ago

The JsBuiltIn system makes cross-platform development difficult, since:

Also, dealing with bytecode regeneration is a constant source of conflicts and annoyance.

So a couple of questions for the OG Chakra devs:

  1. What advantage does the JsBulitIn system give (as opposed to just keeping it in C++)?
  2. Why is there different byte code for x86/x64/jit/no-jit?

@pleath @boingoing @jackhorton

ppenzin commented 4 years ago

Also, what happens on Arm? The regen scripts does not seem to handle it.

rhuanjl commented 4 years ago

I can add a bit more information - though obviously the original chakra devs will have more.

Advantages:

  1. JsBuiltins have the potential of being inlined by the JIT, for performance this is particularly useful for methods that use callbacks and also for the arrray iterator to be inlined into for...of loops.

  2. JsBuiltins are easier to write for some purposes than the same in c++ I think this may be the reason why half of the INTL library is internally written in Javascript.

Different Bytecode The Bytecode changes when new direct fields are added or removed and when new opcodes are added or removed. I think NoJit ifdefs out some opcodes leading to the Jit/NoJit difference. I'm less sure on the x86 vs x64 difference. I don't think there's a x86 vs ARM difference - though I could be wrong.

Note on the regen scrips The regen scripts are super simple BUT what they're doing is passing a variety of parameters and the .js builtin files to ch.exe linked to a debug build of chakracore - the parameters to dump the bytecode aren't recognised by a non-debug build. Additionally to make the x86 version needs an x86 debug build of chakracore - which is only available on windows. So whilst the script could be rewritten to be cross platoform as long as the x86 bytecode is different to x64 there's a need for windows to be able to run it.

Options for improving dev experience

  1. Make the bytecode x86/x64 independent to reduce the number of versions AND enable regenerating it on macOS or Linux with a python script. As noted above I do not know what is different between x86 and x64 bytecode at the moment - is there an x64 only feature that impacts it?

  2. Create an online tool/bot that can do the bytecode regen - no change to CC source, just setup a tool (some kind of container running windows) we can upload to that will regen for us.

  3. Scrap the self hosted bytecode all together by doing one of the following: a. embedding it as unparsed JS instead - using JsToC or some equivalent: downside is the time taken to convert it to bytecode at runtime when it's needed. b. Using C++ implementations of these methods: downsides are i) potential performance hits AND ii) time taken to do this - particularly for intl this would be a lot of work

ppenzin commented 4 years ago

I might be wrong about this, but it sounds like the script is just invoking ch with a number of parameters that allow it to print a bytecode dump for the builtins (it puts it into a C++ header, but at the core it is just printing). Depending on how self-contained parts of the pipeline are and how much backend gets used in this case, it might be possible to link frontend and bytecode dumper into a separate shell executable that would only do the generation. Then the same libraries and the rest of the backend would be linked into ch. I am planning to do some experimentation on this, thought it might not end up being a quick fix.

rhuanjl commented 4 years ago

@ppenzin You're right that the script is invoking ch with a number of parameters BUT the work that ch then does needs the parser, the bytecode emitter and the bytecode dumping logic out of a debug build of chakracore.

If we could somehow make a build step that builds just those components and then produces the bytecode before building the rest of chakracore that would be amazing - notably if it was done every time CC was built you'd skip the need to store the bytecode in the source tree (reducing an enormous pile of conflicts) AND you'd eliminate the windows only issue as x86 bytecode would only be needed when doing an x86 build.

zenparsing commented 4 years ago

@ppenzin I was thinking along the same lines, but I might not be able to look into it for a few days. Let me know what you find!

jackhorton commented 4 years ago

I personally had briefly looked into doing bytecodegen during the build, and came to the conclusion that MSBuild makes it tricky, since you basically have to build chakracore.dll twice. Its almost certainly doable, though, and I know V8 already does something along those lines (though they are doing a much more complex version). I wouldnt be surprised if SpiderMonkey did that as well, since I know they selfhost a decent chunk of their stdlib in JS. Not sure about JSC.

I am fairly sure that you cant just split chakracore.dll into cc_frontend.dll and cc_backend.dll because bailouts/rejits/lazy parsing all require jumping from the backend back to the frontend.

As for why the bytecode is different, I forget exactly why but I think theres only a handful of opcodes that are different between x86 and x64, and then an even fewer number that were different between jit and nojit.

And finally, I dont think I would recommend moving from JS to C++ implementations if possible -- in addition to the increased speed from jitting and inlining (which is only applicable really when youre calling back out into user code, like the Array.prototype methods), JS builtins are easier to write 1) at all (in my opinion, compared to C++) and 2) securely -- you cant use-after-free in JS code :)

jackhorton commented 4 years ago

oh and for ARM64, we simply never released a public version of chakracore for ARM64. For chakrafull, which we did release ARM and ARM64 versions for, I think that the bytecode for ARM64 was the same as x64 and ARM was the same as x86?

rhuanjl commented 4 years ago

I am fairly sure that you cant just split chakracore.dll into cc_frontend.dll and cc_backend.dll because bailouts/rejits/lazy parsing all require jumping from the backend back to the frontend.

If the aim was to have a separate build target that just did parsing and bytecode emission to create this bytecode as a build step there may be less problems - it would just be a case of whether you could build that without building the rest of the library though it may be messy to excise it properly.

boingoing commented 4 years ago

The ARM and ARM64 builds use the same bytecode as x86 and x64, respectively. Yes.

As to why we have some builtins written in Javascript at all, this thread is correct. We originally wrote Intl.js because it was much quicker to develop in Javascript vs native c++ and the Javascript can be optimized better by the JIT. Later we added JsBuiltIns.js as a way to accelerate adding new library features such as the new Array builtins. This was also done to get better optimized by JIT - some Array builtins in particular can take advantage. The security aspects of writing Javascript code weren't as big a driver. We hoped by being able to write builtins in Javascript, the community would be better able to help us implement new features in the spec.

As an aside, I do have to point out that we've had to fix several security holes in the builtin Javascript code so it's not as much of a security boundary as we'd like. You can look at previously patched CVE rollups to see what kinds of issues we've had.

So, what should we do about regenerating bytecode? It was always a pain point for us Chakra developers and it wasn't helped very much by the build system we needed to use to build our internal binaries. In my opinion, of the options available the simplest one is to unify the bytecode between x86 and x64. We never attempted this but we had built these scripts and do not primarily develop on Mac so it wasn't a big enough motivator for us. Generating bytecode in the cloud seems doable as well but logistically painful. And rewriting all of the builtin JS into c++ is the least good - Intl is massive - even if the perf change wouldn't be big. That huge new c++ surface would also be ripe for security issues which are more or less ironed out in the existing Intl.js code.

ppenzin commented 4 years ago

I think the main goal should be to keep the bytecode up to date, which would is challenging whichever way we approach it, because bytecode can change both because of the changes to intinsics and changes to the compiler.

Currently the developer regenerates bytecode and check it in, which means this does not have to happen on every build, but on the other hand nothing prevents it from being out of date (or slightly out of date, that you won't notice it before checking in).

With automated regeneration it would get to run every time, but I think we can make the file change only if bytecode changed, which means the backend won't be relinked unless there are changes to bytecode. I think this approach is safer and more streamlined. As for overhead, let's try it and find out.

rhuanjl commented 4 years ago

Just to add a bit of info for anyone reading this who isn't aware, you can generate the bytecode headers on macos with the following commands (assuming you're in the root chakracore directoy and have built a testbuild of CC):

out/test/ch lib/Runtime/Library/JsBuiltIn/JsBuiltIn.js -LdChakraLib -GenerateLibraryByteCodeHeader >lib/Runtime/Library/JsBuiltin/JsBuiltIn.js.bc.64b.h

out/test/ch lib/Runtime/Library/InJavascript/Intl.js -Intl -GenerateLibraryByteCodeHeader >lib/Runtime/Library/InJavascript/Intl.js.bc.64b.h

So if you're developing something that triggers a bytecode change you can test it by building, running those commands then rebuilding before performing your test. (a similar command will work on linux, on windows it'll be ch.exe and the path will be slightly different as it has a different build output folder)

The remaining challenge is that those sets of bytecode will only work with a 64bit chakracore with the JIT enabled - nojit and 32bit versions of CC require different bytecode to be generated which can only be generated with a nojit build and a 32 bit build respectively.

rhuanjl commented 4 years ago

@boingoing do you know what gives rise to the 32bit/64bit difference - I've had a look for any relevant #ifdefs but none stand out - deleting the only 2 that looked vaguely relevant had no impact at all.

(I also note that the 32bit and 64bit bytecode files differ in length by only one line so whatever the difference is can't be that big.)

jackhorton commented 4 years ago

I believe that we had a PR validation check to make sure that the bytecode didnt need to be regenerated -- and if you changed the internal bytecode enough without regenerating the files, the engine would simply fail to load them. So, it was usually fairly hard to let the files get totally out of sync (i think it only really happened during simultaneous incompatible merges)

jackhorton commented 4 years ago

And for the difference, I think you would be able to use a more advanced diff tool like Beyond Compare to figure out which actual opcode(s) is different. Do note that Intl and JsBuiltIns may not exercise all available opcodes, so the only for-sure way to tell is to check opcodes.h or bytecodegenerator and look for #ifdefs or other platform checks.

boingoing commented 4 years ago

@rhuanjl I'm not sure off the top of my head what the differences all are but we generate different bytecode in some cases for sure. You can get a better look at the bytecode using the -dump:bytecode switch and passing it to ch. I don't know if I'd try that with Intl.js as there's so very much code in there. You'll probably be able to see differences in some trivial code.

pleath commented 4 years ago

At one time there was no difference. Early on we went to a lot of trouble to make sure of that. It would be interesting to know exactly what the difference is, because I suspect it could be eliminated.

-- Paul

From: Taylor Woll notifications@github.com Sent: Tuesday, April 14, 2020 10:29 AM To: microsoft/ChakraCore ChakraCore@noreply.github.com Cc: Paul Leathers pleath@microsoft.com; Mention mention@noreply.github.com Subject: Re: [microsoft/ChakraCore] Questions about JsBulitIn (#6410)

@rhuanjlhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Frhuanjl&data=02%7C01%7Cpleath%40microsoft.com%7C2922dd4ca2b5441a8cce08d7e0994ac2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224821299173704&sdata=E6RBWlPYn7sSS4k8f8jgHu0Y5KIVDVT0tfHEaZG4Jwo%3D&reserved=0 I'm not sure off the top of my head what the differences all are but we generate different bytecode in some cases for sure. You can get a better look at the bytecode using the -dump:bytecode switch and passing it to ch. I don't know if I'd try that with Intl.js as there's so very much code in there. You'll probably be able to see differences in some trivial code.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fissues%2F6410%23issuecomment-613575706&data=02%7C01%7Cpleath%40microsoft.com%7C2922dd4ca2b5441a8cce08d7e0994ac2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224821299173704&sdata=o9HSfRW%2Fc3AO0frRQ4NjZqh6MHAzyLf5AWg9ToyQs9w%3D&reserved=0, or unsubscribehttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FADYXYRE4SHYXAOPUV4ABZYLRMSMM7ANCNFSM4MF2JQ4Q&data=02%7C01%7Cpleath%40microsoft.com%7C2922dd4ca2b5441a8cce08d7e0994ac2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224821299183697&sdata=PLc1y5iQ4t1cYDM2As1S%2BAXUGu0g7o84Vc98M49wwgQ%3D&reserved=0.

rhuanjl commented 4 years ago

At one time there was no difference. Early on we went to a lot of trouble to make sure of that. It would be interesting to know exactly what the difference is, because I suspect it could be eliminated.

Having a look... Intl.js’s bytecode has a small 32bit vs 64bit diff - the 32bit one uses need 19 extra bytes plus a small number of different selections. My current guess is that these may be extra commands for handling numbers when FLOAT_VAR is disabled (as it is on 32bit)

Conversely the only difference in JsBuiltin.js’s bytecode for 32bit vs 64bit is in the header at the start where it specifies the architecture.

pleath commented 4 years ago

Thanks for looking. Nothing about number representation should be expressed in the byte code, so we should definitely kill that if it is indeed what the delta is about.

-- Paul

From: Richard notifications@github.com Sent: Tuesday, April 14, 2020 10:46 AM To: microsoft/ChakraCore ChakraCore@noreply.github.com Cc: Paul Leathers pleath@microsoft.com; Mention mention@noreply.github.com Subject: Re: [microsoft/ChakraCore] Questions about JsBulitIn (#6410)

At one time there was no difference. Early on we went to a lot of trouble to make sure of that. It would be interesting to know exactly what the difference is, because I suspect it could be eliminated. …

Having a look... Intl.js’s bytecode has a small 32bit be 64bit - the 32bit one uses need 19 extra bytes plus a small number of different selections. My current guess is that these may be extra commands for handling numbers when FLOAT_VAR is disabled (as it is on 32bit)

Conversely the only difference in JsBuiltin.js’s bytecode for 32bit vs 64bit is in the header at the start where it specifies the architecture.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fissues%2F6410%23issuecomment-613584223&data=02%7C01%7Cpleath%40microsoft.com%7Ca09f81a5bb134e987ee508d7e09bb4c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224831674403804&sdata=j5vJqLxYmonnu7OP0jXgwzAQ4mtOLcGiDhgk9mQHXdg%3D&reserved=0, or unsubscribehttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FADYXYRC37MHTXDWV345MBDDRMSONZANCNFSM4MF2JQ4Q&data=02%7C01%7Cpleath%40microsoft.com%7Ca09f81a5bb134e987ee508d7e09bb4c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224831674403804&sdata=9H%2F%2B3sKgmmIpxDzMQvumzNlwiviUiEwp68eBj70qGsI%3D&reserved=0.

rhuanjl commented 4 years ago

Done a full diff - was simpler than I thought, Intl.js bytecode with x86 vs x64 differences are:

  1. two large numerical constants to be loaded with LdC_A_I4 on x64 vs LdC_A_R8 on x86
  2. String concatenation done with ADD_A on x64 vs NewConcatStrMulti on x86, snippet:

x64:

 Line 2414: const language = '\\b(?:'     +                                         // language      =
  Col    9: ^
    013a   Ld_A                 R56  R18 
    013d   Conv_Str             R57  R37 
    0142   Add_A                R56  R56  R57 
    0146   Add_A                R56  R56  R24 
    014a   Add_A                R56  R10  R56 
    014e   Ld_A                 R57  R25 
    0151   Conv_Str             R58  R49 
    0156   Add_A                R57  R57  R58 
    015a   Add_A                R57  R57  R26 
    015e   Add_A                R56  R56  R57 
    0162   Ld_A                 R57  R5 
    0165   Conv_Str             R58  R37 
    016a   Add_A                R57  R57  R58 
    016e   Add_A                R57  R57  R27 
    0172   Add_A                R56  R56  R57 
    0176   Add_A                R56  R56  R11 
    017a   Ld_A                 R50  R56 

x86

  Line 2414: const language = '\\b(?:'     +                                         // language      =
  Col    9: ^
    013a   Ld_A                 R57  R18 
    013d   Conv_Str             R58  R37 
    0142   Add_A                R57  R57  R58 
    0146   Add_A                R57  R57  R24 
    014a   NewConcatStrMulti    R56  R10  R57  int:5 
    014f   Ld_A                 R57  R25 
    0152   Conv_Str             R58  R49 
    0157   Add_A                R57  R57  R58 
    015b   Add_A                R57  R57  R26 
    015f   Ld_A                 R58  R5 
    0162   Conv_Str             R59  R37 
    0167   Add_A                R58  R58  R59 
    016b   Add_A                R58  R58  R27 
    016f   SetConcatStrMultiItem2 R56  R57  R58  int:2 
    0174   SetConcatStrMultiItem R56  R11  int:4 
    0178   Ld_A                 R50  R56 

Both of this differences happen twice - once in the windows_glob seciton and once in the icu section. The overall diff looks larger due to an alignment difference caused by the string concatenation vs ADD_A diff.

pleath commented 4 years ago

Great. Thanks. I’ll look and see. I’m really mystified by the string concat difference.

-- Paul

From: Richard notifications@github.com Sent: Tuesday, April 14, 2020 2:43 PM To: microsoft/ChakraCore ChakraCore@noreply.github.com Cc: Paul Leathers pleath@microsoft.com; Mention mention@noreply.github.com Subject: Re: [microsoft/ChakraCore] Questions about JsBulitIn (#6410)

Done a full diff - was simpler than I thought, Intl.js bytecode with x86 vs x64 differences are:

  1. two large numerical constants to be loaded with LdC_A_I4 on x64 vs LdC_A_R8 on x86
  2. String concatenation done with ADD_A on x64 vs NewConcatStrMulti on x86, snippet:

x64:

Line 2414: const language = '\b(?:' + // language =

Col 9: ^

013a   Ld_A                 R56  R18

013d   Conv_Str             R57  R37

0142   Add_A                R56  R56  R57

0146   Add_A                R56  R56  R24

014a   Add_A                R56  R10  R56

014e   Ld_A                 R57  R25

0151   Conv_Str             R58  R49

0156   Add_A                R57  R57  R58

015a   Add_A                R57  R57  R26

015e   Add_A                R56  R56  R57

0162   Ld_A                 R57  R5

0165   Conv_Str             R58  R37

016a   Add_A                R57  R57  R58

016e   Add_A                R57  R57  R27

0172   Add_A                R56  R56  R57

0176   Add_A                R56  R56  R11

017a   Ld_A                 R50  R56

x86

Line 2414: const language = '\b(?:' + // language =

Col 9: ^

013a   Ld_A                 R57  R18

013d   Conv_Str             R58  R37

0142   Add_A                R57  R57  R58

0146   Add_A                R57  R57  R24

014a   NewConcatStrMulti    R56  R10  R57  int:5

014f   Ld_A                 R57  R25

0152   Conv_Str             R58  R49

0157   Add_A                R57  R57  R58

015b   Add_A                R57  R57  R26

015f   Ld_A                 R58  R5

0162   Conv_Str             R59  R37

0167   Add_A                R58  R58  R59

016b   Add_A                R58  R58  R27

016f   SetConcatStrMultiItem2 R56  R57  R58  int:2

0174   SetConcatStrMultiItem R56  R11  int:4

0178   Ld_A                 R50  R56

Both of this differences happen twice - once in the windows_glob seciton and once in the icu section. The overall diff looks larger due to an alignment difference caused by the string concatenation vs ADD_A diff.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fissues%2F6410%23issuecomment-613696339&data=02%7C01%7Cpleath%40microsoft.com%7C745d18fd728e4840415408d7e0bcd40d%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224973930816793&sdata=IpZRkto5H7DG9gJHp8yrO4Z3Y%2FOXCcqfaw8cbuJEZ6U%3D&reserved=0, or unsubscribehttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FADYXYRHXFPZSLMDXDDMELNTRMTKG7ANCNFSM4MF2JQ4Q&data=02%7C01%7Cpleath%40microsoft.com%7C745d18fd728e4840415408d7e0bcd40d%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224973930816793&sdata=4Sr0n2zUgrpXP%2BkiE1g4%2B4gXjUSrULiyWKyhV49UfiM%3D&reserved=0.

rhuanjl commented 4 years ago

Kept looking - the bytecode emitter does a check to see whether string concat is more efficient than ADD_A using this method that involves comparing to the size of a pointer: https://github.com/microsoft/ChakraCore/blob/6e1bf5a86fd1d2ea77e1f326a21752fa3ddca764/lib/Runtime/Library/CompoundString.cpp#L173-L184

As for the constants being LD_C_I4 vs LD_A, I'm unsure where the logic is, the numbers are: -2,146,828,260 and -2,146,828,281 which both look small enough to fit inside an INT32 - unless is it that tagged ints on 32bit CC are actually int28s or equivalent?

pleath commented 4 years ago

Well, if the byte code size difference isn’t too bad, we could just always emit the string concat byte code and let the interpreter and JIT decide whether to honor it or treat it as an Add_A. The difference in the constants is tougher to get rid of, though. Tagged int is an int31 on 32-bit but int32 on 64-bit. The values in the constant table (which is what you’re seeing in the dump, a table embedded in the FunctionBody rather than actual load-constant ops, but still part of the serialized byte code) have to be valid Vars. I suppose we could encode raw int32’s in the table and indicate somehow which ones those are, but that seems like a lot of trouble to go to for this purpose.

-- Paul

From: Richard notifications@github.com Sent: Tuesday, April 14, 2020 3:03 PM To: microsoft/ChakraCore ChakraCore@noreply.github.com Cc: Paul Leathers pleath@microsoft.com; Mention mention@noreply.github.com Subject: Re: [microsoft/ChakraCore] Questions about JsBulitIn (#6410)

Kept looking - the bytecode emitter does a check to see whether string concat is more efficient than ADD_A using this method that involves comparing to the size of a pointer: https://github.com/microsoft/ChakraCore/blob/6e1bf5a86fd1d2ea77e1f326a21752fa3ddca764/lib/Runtime/Library/CompoundString.cpp#L173-L184https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fblob%2F6e1bf5a86fd1d2ea77e1f326a21752fa3ddca764%2Flib%2FRuntime%2FLibrary%2FCompoundString.cpp%23L173-L184&data=02%7C01%7Cpleath%40microsoft.com%7Ce04c94b51b854d9deba108d7e0bf92c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224985716460287&sdata=9pIxpWJwWhlzyZIXU1izmePw%2FNoghhNDRdfhsfxjia0%3D&reserved=0

As for the constants being LD_C_I4 vs LD_A, I'm unsure where the logic is, the numbers are: -2,146,828,260 and -2,146,828,281 which both look small enough to fit inside an INT32 - unless is it that tagged ints on 32bit CC are actually int28s or equivalent?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fissues%2F6410%23issuecomment-613703720&data=02%7C01%7Cpleath%40microsoft.com%7Ce04c94b51b854d9deba108d7e0bf92c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224985716470282&sdata=fcIFQgEWwr6PoHdlnTAw1Ihm6J8xgf99%2BHIoVeTM%2FCY%3D&reserved=0, or unsubscribehttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FADYXYRCPOB2RJV4M4M6MPFTRMTMQTANCNFSM4MF2JQ4Q&data=02%7C01%7Cpleath%40microsoft.com%7Ce04c94b51b854d9deba108d7e0bf92c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224985716470282&sdata=1JOydx5OU1xfhNAalIvAZfO%2Bkjmg6roZVo2apTVuPYA%3D&reserved=0.

rhuanjl commented 4 years ago

I'm thinking if the aim is just to remove the difference in the intl bytecode - I can edit intl.js so these differences don't occur and rely on CI telling us if we've ever reintroduced them (or any other differences).

  1. For the string concat - can pre-combine in the JS, the concat is only needed to fit a comment in the middle, can move the comment

  2. For the Ints - this is used for comparing against error numbers to detect OOM and SE exceptions so it's not on a fast path - could store these as strings and use toNumber on them to take the Ints out of the bytecode.

May be a lot simpler than editing the bytecode emitter for this.

ppenzin commented 4 years ago

I like the solution in #6425. I think eliminating the checked-in header would be even better, though I am not sure how to integrate that into MSBuild.

I personally had briefly looked into doing bytecodegen during the build, and came to the conclusion that MSBuild makes it tricky, since you basically have to build chakracore.dll twice.

@jackhorton, just to entertain a possibility, what would you think about CMake generating MSBuild files?

I've used that for LLVM, but switched to Ninja as it close to what I am used to.

rhuanjl commented 4 years ago

Well I've merged #6425 so windows remains the same but for now devs working on macOS and Linux can regen the bytecode if they need to. Definitely would be better to find a way to not need the bytecode to be checked in at all - but this works for now.

jackhorton commented 4 years ago

I think we had experimented with it in the past, but the msbuild files we used were highly artisinal (and, imo, fairly high-quality) due to our team's familiarity with the vcxproj system and our requirements for building ChakraFull in the windows build system. As that is a non-goal going forward, and since most people are more comfortable with CMake than vcxproj, using only CMake going forward sounds reasonable to me. For what its worth, you can also have CMake generate ninja files on Windows, if I remember correctly.