Open zenparsing opened 4 years ago
Also, what happens on Arm? The regen scripts does not seem to handle it.
I can add a bit more information - though obviously the original chakra devs will have more.
Advantages:
JsBuiltins have the potential of being inlined by the JIT, for performance this is particularly useful for methods that use callbacks and also for the arrray iterator to be inlined into for...of loops.
JsBuiltins are easier to write for some purposes than the same in c++ I think this may be the reason why half of the INTL library is internally written in Javascript.
Different Bytecode The Bytecode changes when new direct fields are added or removed and when new opcodes are added or removed. I think NoJit ifdefs out some opcodes leading to the Jit/NoJit difference. I'm less sure on the x86 vs x64 difference. I don't think there's a x86 vs ARM difference - though I could be wrong.
Note on the regen scrips The regen scripts are super simple BUT what they're doing is passing a variety of parameters and the .js builtin files to ch.exe linked to a debug build of chakracore - the parameters to dump the bytecode aren't recognised by a non-debug build. Additionally to make the x86 version needs an x86 debug build of chakracore - which is only available on windows. So whilst the script could be rewritten to be cross platoform as long as the x86 bytecode is different to x64 there's a need for windows to be able to run it.
Options for improving dev experience
Make the bytecode x86/x64 independent to reduce the number of versions AND enable regenerating it on macOS or Linux with a python script. As noted above I do not know what is different between x86 and x64 bytecode at the moment - is there an x64 only feature that impacts it?
Create an online tool/bot that can do the bytecode regen - no change to CC source, just setup a tool (some kind of container running windows) we can upload to that will regen for us.
Scrap the self hosted bytecode all together by doing one of the following: a. embedding it as unparsed JS instead - using JsToC or some equivalent: downside is the time taken to convert it to bytecode at runtime when it's needed. b. Using C++ implementations of these methods: downsides are i) potential performance hits AND ii) time taken to do this - particularly for intl this would be a lot of work
I might be wrong about this, but it sounds like the script is just invoking ch
with a number of parameters that allow it to print a bytecode dump for the builtins (it puts it into a C++ header, but at the core it is just printing). Depending on how self-contained parts of the pipeline are and how much backend gets used in this case, it might be possible to link frontend and bytecode dumper into a separate shell executable that would only do the generation. Then the same libraries and the rest of the backend would be linked into ch
. I am planning to do some experimentation on this, thought it might not end up being a quick fix.
@ppenzin You're right that the script is invoking ch with a number of parameters BUT the work that ch then does needs the parser, the bytecode emitter and the bytecode dumping logic out of a debug build of chakracore.
If we could somehow make a build step that builds just those components and then produces the bytecode before building the rest of chakracore that would be amazing - notably if it was done every time CC was built you'd skip the need to store the bytecode in the source tree (reducing an enormous pile of conflicts) AND you'd eliminate the windows only issue as x86 bytecode would only be needed when doing an x86 build.
@ppenzin I was thinking along the same lines, but I might not be able to look into it for a few days. Let me know what you find!
I personally had briefly looked into doing bytecodegen during the build, and came to the conclusion that MSBuild makes it tricky, since you basically have to build chakracore.dll twice. Its almost certainly doable, though, and I know V8 already does something along those lines (though they are doing a much more complex version). I wouldnt be surprised if SpiderMonkey did that as well, since I know they selfhost a decent chunk of their stdlib in JS. Not sure about JSC.
I am fairly sure that you cant just split chakracore.dll into cc_frontend.dll and cc_backend.dll because bailouts/rejits/lazy parsing all require jumping from the backend back to the frontend.
As for why the bytecode is different, I forget exactly why but I think theres only a handful of opcodes that are different between x86 and x64, and then an even fewer number that were different between jit and nojit.
And finally, I dont think I would recommend moving from JS to C++ implementations if possible -- in addition to the increased speed from jitting and inlining (which is only applicable really when youre calling back out into user code, like the Array.prototype methods), JS builtins are easier to write 1) at all (in my opinion, compared to C++) and 2) securely -- you cant use-after-free in JS code :)
oh and for ARM64, we simply never released a public version of chakracore for ARM64. For chakrafull, which we did release ARM and ARM64 versions for, I think that the bytecode for ARM64 was the same as x64 and ARM was the same as x86?
I am fairly sure that you cant just split chakracore.dll into cc_frontend.dll and cc_backend.dll because bailouts/rejits/lazy parsing all require jumping from the backend back to the frontend.
If the aim was to have a separate build target that just did parsing and bytecode emission to create this bytecode as a build step there may be less problems - it would just be a case of whether you could build that without building the rest of the library though it may be messy to excise it properly.
The ARM and ARM64 builds use the same bytecode as x86 and x64, respectively. Yes.
As to why we have some builtins written in Javascript at all, this thread is correct. We originally wrote Intl.js because it was much quicker to develop in Javascript vs native c++ and the Javascript can be optimized better by the JIT. Later we added JsBuiltIns.js as a way to accelerate adding new library features such as the new Array builtins. This was also done to get better optimized by JIT - some Array builtins in particular can take advantage. The security aspects of writing Javascript code weren't as big a driver. We hoped by being able to write builtins in Javascript, the community would be better able to help us implement new features in the spec.
As an aside, I do have to point out that we've had to fix several security holes in the builtin Javascript code so it's not as much of a security boundary as we'd like. You can look at previously patched CVE rollups to see what kinds of issues we've had.
So, what should we do about regenerating bytecode? It was always a pain point for us Chakra developers and it wasn't helped very much by the build system we needed to use to build our internal binaries. In my opinion, of the options available the simplest one is to unify the bytecode between x86 and x64. We never attempted this but we had built these scripts and do not primarily develop on Mac so it wasn't a big enough motivator for us. Generating bytecode in the cloud seems doable as well but logistically painful. And rewriting all of the builtin JS into c++ is the least good - Intl is massive - even if the perf change wouldn't be big. That huge new c++ surface would also be ripe for security issues which are more or less ironed out in the existing Intl.js code.
I think the main goal should be to keep the bytecode up to date, which would is challenging whichever way we approach it, because bytecode can change both because of the changes to intinsics and changes to the compiler.
Currently the developer regenerates bytecode and check it in, which means this does not have to happen on every build, but on the other hand nothing prevents it from being out of date (or slightly out of date, that you won't notice it before checking in).
With automated regeneration it would get to run every time, but I think we can make the file change only if bytecode changed, which means the backend won't be relinked unless there are changes to bytecode. I think this approach is safer and more streamlined. As for overhead, let's try it and find out.
Just to add a bit of info for anyone reading this who isn't aware, you can generate the bytecode headers on macos with the following commands (assuming you're in the root chakracore directoy and have built a testbuild of CC):
out/test/ch lib/Runtime/Library/JsBuiltIn/JsBuiltIn.js -LdChakraLib -GenerateLibraryByteCodeHeader >lib/Runtime/Library/JsBuiltin/JsBuiltIn.js.bc.64b.h
out/test/ch lib/Runtime/Library/InJavascript/Intl.js -Intl -GenerateLibraryByteCodeHeader >lib/Runtime/Library/InJavascript/Intl.js.bc.64b.h
So if you're developing something that triggers a bytecode change you can test it by building, running those commands then rebuilding before performing your test. (a similar command will work on linux, on windows it'll be ch.exe and the path will be slightly different as it has a different build output folder)
The remaining challenge is that those sets of bytecode will only work with a 64bit chakracore with the JIT enabled - nojit and 32bit versions of CC require different bytecode to be generated which can only be generated with a nojit build and a 32 bit build respectively.
@boingoing do you know what gives rise to the 32bit/64bit difference - I've had a look for any relevant #ifdefs but none stand out - deleting the only 2 that looked vaguely relevant had no impact at all.
(I also note that the 32bit and 64bit bytecode files differ in length by only one line so whatever the difference is can't be that big.)
I believe that we had a PR validation check to make sure that the bytecode didnt need to be regenerated -- and if you changed the internal bytecode enough without regenerating the files, the engine would simply fail to load them. So, it was usually fairly hard to let the files get totally out of sync (i think it only really happened during simultaneous incompatible merges)
And for the difference, I think you would be able to use a more advanced diff tool like Beyond Compare to figure out which actual opcode(s) is different. Do note that Intl and JsBuiltIns may not exercise all available opcodes, so the only for-sure way to tell is to check opcodes.h or bytecodegenerator and look for #ifdefs or other platform checks.
@rhuanjl I'm not sure off the top of my head what the differences all are but we generate different bytecode in some cases for sure. You can get a better look at the bytecode using the -dump:bytecode
switch and passing it to ch
. I don't know if I'd try that with Intl.js as there's so very much code in there. You'll probably be able to see differences in some trivial code.
At one time there was no difference. Early on we went to a lot of trouble to make sure of that. It would be interesting to know exactly what the difference is, because I suspect it could be eliminated.
-- Paul
From: Taylor Woll notifications@github.com Sent: Tuesday, April 14, 2020 10:29 AM To: microsoft/ChakraCore ChakraCore@noreply.github.com Cc: Paul Leathers pleath@microsoft.com; Mention mention@noreply.github.com Subject: Re: [microsoft/ChakraCore] Questions about JsBulitIn (#6410)
@rhuanjlhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Frhuanjl&data=02%7C01%7Cpleath%40microsoft.com%7C2922dd4ca2b5441a8cce08d7e0994ac2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224821299173704&sdata=E6RBWlPYn7sSS4k8f8jgHu0Y5KIVDVT0tfHEaZG4Jwo%3D&reserved=0 I'm not sure off the top of my head what the differences all are but we generate different bytecode in some cases for sure. You can get a better look at the bytecode using the -dump:bytecode switch and passing it to ch. I don't know if I'd try that with Intl.js as there's so very much code in there. You'll probably be able to see differences in some trivial code.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fissues%2F6410%23issuecomment-613575706&data=02%7C01%7Cpleath%40microsoft.com%7C2922dd4ca2b5441a8cce08d7e0994ac2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224821299173704&sdata=o9HSfRW%2Fc3AO0frRQ4NjZqh6MHAzyLf5AWg9ToyQs9w%3D&reserved=0, or unsubscribehttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FADYXYRE4SHYXAOPUV4ABZYLRMSMM7ANCNFSM4MF2JQ4Q&data=02%7C01%7Cpleath%40microsoft.com%7C2922dd4ca2b5441a8cce08d7e0994ac2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224821299183697&sdata=PLc1y5iQ4t1cYDM2As1S%2BAXUGu0g7o84Vc98M49wwgQ%3D&reserved=0.
At one time there was no difference. Early on we went to a lot of trouble to make sure of that. It would be interesting to know exactly what the difference is, because I suspect it could be eliminated. …
Having a look... Intl.js’s bytecode has a small 32bit vs 64bit diff - the 32bit one uses need 19 extra bytes plus a small number of different selections. My current guess is that these may be extra commands for handling numbers when FLOAT_VAR is disabled (as it is on 32bit)
Conversely the only difference in JsBuiltin.js’s bytecode for 32bit vs 64bit is in the header at the start where it specifies the architecture.
Thanks for looking. Nothing about number representation should be expressed in the byte code, so we should definitely kill that if it is indeed what the delta is about.
-- Paul
From: Richard notifications@github.com Sent: Tuesday, April 14, 2020 10:46 AM To: microsoft/ChakraCore ChakraCore@noreply.github.com Cc: Paul Leathers pleath@microsoft.com; Mention mention@noreply.github.com Subject: Re: [microsoft/ChakraCore] Questions about JsBulitIn (#6410)
At one time there was no difference. Early on we went to a lot of trouble to make sure of that. It would be interesting to know exactly what the difference is, because I suspect it could be eliminated. …
Having a look... Intl.js’s bytecode has a small 32bit be 64bit - the 32bit one uses need 19 extra bytes plus a small number of different selections. My current guess is that these may be extra commands for handling numbers when FLOAT_VAR is disabled (as it is on 32bit)
Conversely the only difference in JsBuiltin.js’s bytecode for 32bit vs 64bit is in the header at the start where it specifies the architecture.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fissues%2F6410%23issuecomment-613584223&data=02%7C01%7Cpleath%40microsoft.com%7Ca09f81a5bb134e987ee508d7e09bb4c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224831674403804&sdata=j5vJqLxYmonnu7OP0jXgwzAQ4mtOLcGiDhgk9mQHXdg%3D&reserved=0, or unsubscribehttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FADYXYRC37MHTXDWV345MBDDRMSONZANCNFSM4MF2JQ4Q&data=02%7C01%7Cpleath%40microsoft.com%7Ca09f81a5bb134e987ee508d7e09bb4c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224831674403804&sdata=9H%2F%2B3sKgmmIpxDzMQvumzNlwiviUiEwp68eBj70qGsI%3D&reserved=0.
Done a full diff - was simpler than I thought, Intl.js bytecode with x86 vs x64 differences are:
x64:
Line 2414: const language = '\\b(?:' + // language =
Col 9: ^
013a Ld_A R56 R18
013d Conv_Str R57 R37
0142 Add_A R56 R56 R57
0146 Add_A R56 R56 R24
014a Add_A R56 R10 R56
014e Ld_A R57 R25
0151 Conv_Str R58 R49
0156 Add_A R57 R57 R58
015a Add_A R57 R57 R26
015e Add_A R56 R56 R57
0162 Ld_A R57 R5
0165 Conv_Str R58 R37
016a Add_A R57 R57 R58
016e Add_A R57 R57 R27
0172 Add_A R56 R56 R57
0176 Add_A R56 R56 R11
017a Ld_A R50 R56
x86
Line 2414: const language = '\\b(?:' + // language =
Col 9: ^
013a Ld_A R57 R18
013d Conv_Str R58 R37
0142 Add_A R57 R57 R58
0146 Add_A R57 R57 R24
014a NewConcatStrMulti R56 R10 R57 int:5
014f Ld_A R57 R25
0152 Conv_Str R58 R49
0157 Add_A R57 R57 R58
015b Add_A R57 R57 R26
015f Ld_A R58 R5
0162 Conv_Str R59 R37
0167 Add_A R58 R58 R59
016b Add_A R58 R58 R27
016f SetConcatStrMultiItem2 R56 R57 R58 int:2
0174 SetConcatStrMultiItem R56 R11 int:4
0178 Ld_A R50 R56
Both of this differences happen twice - once in the windows_glob seciton and once in the icu section. The overall diff looks larger due to an alignment difference caused by the string concatenation vs ADD_A diff.
Great. Thanks. I’ll look and see. I’m really mystified by the string concat difference.
-- Paul
From: Richard notifications@github.com Sent: Tuesday, April 14, 2020 2:43 PM To: microsoft/ChakraCore ChakraCore@noreply.github.com Cc: Paul Leathers pleath@microsoft.com; Mention mention@noreply.github.com Subject: Re: [microsoft/ChakraCore] Questions about JsBulitIn (#6410)
Done a full diff - was simpler than I thought, Intl.js bytecode with x86 vs x64 differences are:
x64:
Line 2414: const language = '\b(?:' + // language =
Col 9: ^
013a Ld_A R56 R18
013d Conv_Str R57 R37
0142 Add_A R56 R56 R57
0146 Add_A R56 R56 R24
014a Add_A R56 R10 R56
014e Ld_A R57 R25
0151 Conv_Str R58 R49
0156 Add_A R57 R57 R58
015a Add_A R57 R57 R26
015e Add_A R56 R56 R57
0162 Ld_A R57 R5
0165 Conv_Str R58 R37
016a Add_A R57 R57 R58
016e Add_A R57 R57 R27
0172 Add_A R56 R56 R57
0176 Add_A R56 R56 R11
017a Ld_A R50 R56
x86
Line 2414: const language = '\b(?:' + // language =
Col 9: ^
013a Ld_A R57 R18
013d Conv_Str R58 R37
0142 Add_A R57 R57 R58
0146 Add_A R57 R57 R24
014a NewConcatStrMulti R56 R10 R57 int:5
014f Ld_A R57 R25
0152 Conv_Str R58 R49
0157 Add_A R57 R57 R58
015b Add_A R57 R57 R26
015f Ld_A R58 R5
0162 Conv_Str R59 R37
0167 Add_A R58 R58 R59
016b Add_A R58 R58 R27
016f SetConcatStrMultiItem2 R56 R57 R58 int:2
0174 SetConcatStrMultiItem R56 R11 int:4
0178 Ld_A R50 R56
Both of this differences happen twice - once in the windows_glob seciton and once in the icu section. The overall diff looks larger due to an alignment difference caused by the string concatenation vs ADD_A diff.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fissues%2F6410%23issuecomment-613696339&data=02%7C01%7Cpleath%40microsoft.com%7C745d18fd728e4840415408d7e0bcd40d%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224973930816793&sdata=IpZRkto5H7DG9gJHp8yrO4Z3Y%2FOXCcqfaw8cbuJEZ6U%3D&reserved=0, or unsubscribehttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FADYXYRHXFPZSLMDXDDMELNTRMTKG7ANCNFSM4MF2JQ4Q&data=02%7C01%7Cpleath%40microsoft.com%7C745d18fd728e4840415408d7e0bcd40d%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224973930816793&sdata=4Sr0n2zUgrpXP%2BkiE1g4%2B4gXjUSrULiyWKyhV49UfiM%3D&reserved=0.
Kept looking - the bytecode emitter does a check to see whether string concat is more efficient than ADD_A using this method that involves comparing to the size of a pointer: https://github.com/microsoft/ChakraCore/blob/6e1bf5a86fd1d2ea77e1f326a21752fa3ddca764/lib/Runtime/Library/CompoundString.cpp#L173-L184
As for the constants being LD_C_I4 vs LD_A, I'm unsure where the logic is, the numbers are: -2,146,828,260 and -2,146,828,281 which both look small enough to fit inside an INT32 - unless is it that tagged ints on 32bit CC are actually int28s or equivalent?
Well, if the byte code size difference isn’t too bad, we could just always emit the string concat byte code and let the interpreter and JIT decide whether to honor it or treat it as an Add_A. The difference in the constants is tougher to get rid of, though. Tagged int is an int31 on 32-bit but int32 on 64-bit. The values in the constant table (which is what you’re seeing in the dump, a table embedded in the FunctionBody rather than actual load-constant ops, but still part of the serialized byte code) have to be valid Vars. I suppose we could encode raw int32’s in the table and indicate somehow which ones those are, but that seems like a lot of trouble to go to for this purpose.
-- Paul
From: Richard notifications@github.com Sent: Tuesday, April 14, 2020 3:03 PM To: microsoft/ChakraCore ChakraCore@noreply.github.com Cc: Paul Leathers pleath@microsoft.com; Mention mention@noreply.github.com Subject: Re: [microsoft/ChakraCore] Questions about JsBulitIn (#6410)
Kept looking - the bytecode emitter does a check to see whether string concat is more efficient than ADD_A using this method that involves comparing to the size of a pointer: https://github.com/microsoft/ChakraCore/blob/6e1bf5a86fd1d2ea77e1f326a21752fa3ddca764/lib/Runtime/Library/CompoundString.cpp#L173-L184https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fblob%2F6e1bf5a86fd1d2ea77e1f326a21752fa3ddca764%2Flib%2FRuntime%2FLibrary%2FCompoundString.cpp%23L173-L184&data=02%7C01%7Cpleath%40microsoft.com%7Ce04c94b51b854d9deba108d7e0bf92c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224985716460287&sdata=9pIxpWJwWhlzyZIXU1izmePw%2FNoghhNDRdfhsfxjia0%3D&reserved=0
As for the constants being LD_C_I4 vs LD_A, I'm unsure where the logic is, the numbers are: -2,146,828,260 and -2,146,828,281 which both look small enough to fit inside an INT32 - unless is it that tagged ints on 32bit CC are actually int28s or equivalent?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmicrosoft%2FChakraCore%2Fissues%2F6410%23issuecomment-613703720&data=02%7C01%7Cpleath%40microsoft.com%7Ce04c94b51b854d9deba108d7e0bf92c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224985716470282&sdata=fcIFQgEWwr6PoHdlnTAw1Ihm6J8xgf99%2BHIoVeTM%2FCY%3D&reserved=0, or unsubscribehttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FADYXYRCPOB2RJV4M4M6MPFTRMTMQTANCNFSM4MF2JQ4Q&data=02%7C01%7Cpleath%40microsoft.com%7Ce04c94b51b854d9deba108d7e0bf92c3%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637224985716470282&sdata=1JOydx5OU1xfhNAalIvAZfO%2Bkjmg6roZVo2apTVuPYA%3D&reserved=0.
I'm thinking if the aim is just to remove the difference in the intl bytecode - I can edit intl.js so these differences don't occur and rely on CI telling us if we've ever reintroduced them (or any other differences).
For the string concat - can pre-combine in the JS, the concat is only needed to fit a comment in the middle, can move the comment
For the Ints - this is used for comparing against error numbers to detect OOM and SE exceptions so it's not on a fast path - could store these as strings and use toNumber on them to take the Ints out of the bytecode.
May be a lot simpler than editing the bytecode emitter for this.
I like the solution in #6425. I think eliminating the checked-in header would be even better, though I am not sure how to integrate that into MSBuild.
I personally had briefly looked into doing bytecodegen during the build, and came to the conclusion that MSBuild makes it tricky, since you basically have to build chakracore.dll twice.
@jackhorton, just to entertain a possibility, what would you think about CMake generating MSBuild files?
I've used that for LLVM, but switched to Ninja as it close to what I am used to.
Well I've merged #6425 so windows remains the same but for now devs working on macOS and Linux can regen the bytecode if they need to. Definitely would be better to find a way to not need the bytecode to be checked in at all - but this works for now.
I think we had experimented with it in the past, but the msbuild files we used were highly artisinal (and, imo, fairly high-quality) due to our team's familiarity with the vcxproj system and our requirements for building ChakraFull in the windows build system. As that is a non-goal going forward, and since most people are more comfortable with CMake than vcxproj, using only CMake going forward sounds reasonable to me. For what its worth, you can also have CMake generate ninja files on Windows, if I remember correctly.
The JsBuiltIn system makes cross-platform development difficult, since:
Also, dealing with bytecode regeneration is a constant source of conflicts and annoyance.
So a couple of questions for the OG Chakra devs:
@pleath @boingoing @jackhorton