Closed GoogleCodeExporter closed 9 years ago
To build v8 with MinGW (which is needed in order to link it properly) you need
to do the following:
1) Open SConstruct and src/SConscript, and replace "Environment()" with "Environment(tools = ['mingw'])"
2) src/v8utils.h needs the following:
#ifdef __GNUC__
#include <stdarg.h>
#endif
3) platform-win32.cc requires the long form of strncpy_s:
strncpy_s(name_, sizeof(name_), name, sizeof(name_));
...instead of:
strncpy_s(name_, name, sizeof(name_));
Original comment by seth.h...@gmail.com
on 16 Jan 2011 at 4:04
Also need to add libraries:
ws2_32
wsock32
winmm
...to the linker.
-------------------------------
Might also want to change "-O3" to "-Os" in the SConstruct file. (Or not, see
below)
-O3 isn't worth it on GCC 4.5. It might be worth it on 4.6. Since this is a
library, we might consider -O3.
Original comment by seth.h...@gmail.com
on 16 Jan 2011 at 4:29
The v8 library needs (seems to need) _WIN32_WINNT set to 0x0501. This would
bump the minimum Windows version for Wait Zar up from Windows 2000 to Windows
2003 or XP.
Original comment by seth.h...@gmail.com
on 16 Jan 2011 at 4:38
The linker might also need "-static" set.... not sure about this one, but
trying it out.
Original comment by seth.h...@gmail.com
on 16 Jan 2011 at 4:46
NOTE: We can keep the Windows version (_WIN32_WINNT) at 0x0500 (Win2k).
Original comment by seth.h...@gmail.com
on 16 Jan 2011 at 5:02
This makes the EXE 5MB. Might have to leave this functionality in the DLL....
but I'd really like it for reading JSON without the need for Boost.
Original comment by seth.h...@gmail.com
on 16 Jan 2011 at 5:05
Some fun notes:
1) A mundane (recursive descent?) library for parsing JSON, json-cpp, is nice and tiny (5~8 files), has some notion of "comments" already, and seems pretty fast.
http://jsoncpp.sourceforge.net/
So, that removes the boost dependency. Now, we can implement any scripting
language we want. Maybe lua?
Alternatively, we can implement a language (like Lua) and then load a Lua
library that parses JSON.
Original comment by seth.h...@gmail.com
on 17 Jan 2011 at 9:05
Note: Lua is very tiny (~800 kb source), and, though written in C, compiles as
C++ as well. It is also extremely stable.
Now reading the syntax of Lua....
Original comment by seth.h...@gmail.com
on 17 Jan 2011 at 9:25
Building Lua in release mode takes 15s. It creates a DLL 248 kb in size.
Building as a static library creates a .lib which is about the same size. So,
we'll most likely want to just build a DLL and leave Lua as an extension.
Original comment by seth.h...@gmail.com
on 18 Jan 2011 at 5:34
We should give the Lua DLL a checksum, so that we can (partially) avoid people
substituting an older/newer version of Lua.
I doubt people will try to hack WZ that way, so we can just store these
checksums in the config files, and have a flag "verify-md5-checksum" that's
On/Off.
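A rough sketch of how the flag could gate the check. It assumes the MD5 digest of the DLL has already been computed elsewhere and is passed in as a hex string; dllAllowed and sameHexDigest are hypothetical names:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: compares two hex digests, ignoring letter case.
bool sameHexDigest(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) {
            return false;
        }
    }
    return true;
}

// Returns true if the DLL may be loaded. With "verify-md5-checksum" switched
// off, the digests are not consulted at all.
bool dllAllowed(bool verifyMd5Checksum,
                const std::string& expectedMd5,
                const std::string& actualMd5) {
    if (!verifyMd5Checksum) return true;
    return sameHexDigest(expectedMd5, actualMd5);
}
```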
Original comment by seth.h...@gmail.com
on 18 Jan 2011 at 5:39
Lua requires some hacking to get Unicode working:
http://lua-users.org/wiki/UnicodeIdentifers
http://lua-users.org/wiki/ValidateUnicodeString
http://luaforge.net/projects/sln/
Seems like it's not too hard. (Have to check if slnunicode supports pattern
matching).
Original comment by seth.h...@gmail.com
on 18 Jan 2011 at 6:54
Note that Lua doesn't have full regexes; rather, its patterns work more at the
character level. This is bad for, e.g., kinzi and stacked letters.
One option I overlooked was using json-cpp to parse the JSON, then compiling V8
as a DLL (using VC++) and loading it dynamically. We could write some kind of
scaffolding function which wraps the entire thing in try{}catch{} (compiled
with VC++) and returns an error code if something goes wrong, since MinGW
won't be able to catch an exception thrown from a VC++-built DLL.
This might be the best option; I'll write a simple test script for this later.
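A minimal sketch of what such a scaffolding function could look like. The names wz_transform, WzScriptResult and runTransform are made up for illustration, and runTransform is only a stand-in for whatever would actually call into V8. The point is that every exception is caught on the DLL side and turned into an integer code:

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical error codes returned across the DLL boundary.
enum WzScriptResult { WZ_OK = 0, WZ_SCRIPT_ERROR = 1, WZ_UNKNOWN_ERROR = 2 };

// Stand-in for the real work (e.g. running a script); it throws on failure.
static std::string runTransform(const std::string& input) {
    if (input.empty()) throw std::runtime_error("empty input");
    return input + "!";
}

// Entry point with C linkage. No exception ever leaves this function, so the
// caller only has to look at the return code. (In a real Windows DLL it would
// also need to be exported, e.g. with __declspec(dllexport).)
extern "C" int wz_transform(const char* input, char* out, unsigned int outSize) {
    try {
        if (!input || !out || outSize == 0) return WZ_SCRIPT_ERROR;
        const std::string result = runTransform(input);
        if (result.size() + 1 > outSize) return WZ_SCRIPT_ERROR;  // buffer too small
        result.copy(out, result.size());
        out[result.size()] = '\0';
        return WZ_OK;
    } catch (const std::exception&) {
        return WZ_SCRIPT_ERROR;
    } catch (...) {
        return WZ_UNKNOWN_ERROR;
    }
}
```

The caller passes in its own buffer, so no memory allocated by one compiler's runtime is freed by the other's.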
Original comment by seth.h...@gmail.com
on 19 Jan 2011 at 5:56
I wrote a test script for json-cpp. Basically, we have to treat the string as
UTF-8, and only convert it to wchar_t* when we actually need the value.
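For illustration only (this is not the attached sample), a hand-rolled UTF-8 to wstring conversion could look roughly like this. utf8ToWide is a hypothetical name; the sketch replaces malformed input with U+FFFD and does not reject overlong encodings:

```cpp
#include <cstddef>
#include <string>

// Decodes UTF-8 into a std::wstring. Invalid or truncated sequences become
// U+FFFD. If wchar_t is 16 bits wide (as on Windows), code points above
// U+FFFF are written as a surrogate pair.
std::wstring utf8ToWide(const std::string& utf8) {
    std::wstring out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i++]);
        unsigned long cp = 0xFFFD;
        int extra = -1;  // -1 marks an invalid lead byte
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        bool ok = (extra >= 0);
        for (int k = 0; ok && k < extra; ++k) {
            if (i < utf8.size() &&
                (static_cast<unsigned char>(utf8[i]) & 0xC0) == 0x80) {
                cp = (cp << 6) | (static_cast<unsigned char>(utf8[i]) & 0x3F);
                ++i;
            } else {
                ok = false;  // missing or malformed continuation byte
            }
        }
        if (!ok || cp > 0x10FFFF) cp = 0xFFFD;
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}
```

The JSON values stay as UTF-8 std::string until the moment a wide string is actually required, at which point something like utf8ToWide(value) is called.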
Sample (and library) code attached.
Note that we could define JSON_VALUE_USE_INTERNAL_MAP if we want objects to be
stored as maps instead of vectors. Currently, config files are small, so I see
no need to do this.
(This point is irrelevant for now; JSON_VALUE_USE_INTERNAL_MAP will trigger a
union error in MinGW).
Original comment by seth.h...@gmail.com
on 19 Jan 2011 at 9:09
Replaced Json_Spirit with Json_CPP. Removed Boost as well. Total build time is
down to 149 seconds.
This also allows us to focus on scripting (Javascript or Lua) using a DLL only.
Original comment by seth.h...@gmail.com
on 19 Jan 2011 at 10:38
v8 accepts either UTF-8 or unsigned short* arrays. Both require some conversion
from wstrings, so the choice of which format to use will depend on how V8 (or
Javascript in general) handles Unicode internally.
Original comment by seth.h...@gmail.com
on 20 Jan 2011 at 5:46
From the mailing list:
> The internal storage format for strings are either ASCII (one byte per char)
> or UTF-16 (two bytes per char). So any UTF-8 string which has only ASCII
> characters is stored as ASCII otherwise UTF-8 is converted to UTF-16. The
> fact that there is no uint16_t version of NewSymbol in the API is mainly
> because no-one has added it.
So, we'll use the UTF-16 variant.
Original comment by seth.h...@gmail.com
on 20 Jan 2011 at 5:54
Note: on Windows, wchar_t and uint16_t are the same size. So we should be able
to fast-convert the array.
Maybe with a pointer conversion?
uint16_t* arr = &wstring().c_str()[0];
Is this a bad idea? It's certainly fast.
Original comment by seth.h...@gmail.com
on 20 Jan 2011 at 6:01
It's even easier.
For input into v8:
uint16_t* x = (uint16_t*)(src.c_str());
For output from v8:
wstring myresStr((wchar_t*)*myres);
Note that, somewhere in the code, we should ensure that:
sizeof(wchar_t) == sizeof(uint16_t)
...just to give anyone compiling WZ on a 64-bit system some warning.
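If the cast ever turns out to be a problem, a copying conversion is a slower but portable alternative. A sketch, with the hypothetical names toUtf16 and kWideIsUtf16; a build that relies on the cast could instead check the same condition at compile time (e.g. with a C++11 static_assert):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// True where wchar_t is a 16-bit code unit (e.g. Windows); false where it is
// 32 bits wide (e.g. most Linux toolchains).
const bool kWideIsUtf16 = (sizeof(wchar_t) == sizeof(std::uint16_t));

// Copies a wstring into 16-bit code units. With a 16-bit wchar_t this is a
// plain element-wise copy; with a wider wchar_t, values above U+FFFF are
// split into a surrogate pair and out-of-range values become U+FFFD.
std::vector<std::uint16_t> toUtf16(const std::wstring& src) {
    std::vector<std::uint16_t> out;
    out.reserve(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(src[i]);
        if (cp > 0x10FFFF) cp = 0xFFFD;
        if (cp <= 0xFFFF) {
            out.push_back(static_cast<std::uint16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<std::uint16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The result can be handed to any API that takes a pointer and a length, via out.data() (or &out[0]) and out.size().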
Original comment by seth.h...@gmail.com
on 20 Jan 2011 at 6:19
Interestingly enough, most of the security concerns I was worried about aren't
in ECMAScript at all, but in the various browser addons.
For example, the following objects are not defined:
* document
* fopen
* XMLHttpRequest
* alert
This might turn out to be a decent option after all.
Original comment by seth.h...@gmail.com
on 20 Jan 2011 at 6:58
Wrote a small driver program and compiled it into V8. The symbol is definitely
there (checked with objdump).
Next up:
1) Visual Studio: Use LoadLibrary() etc. (no *.lib file) to load the DLL.
2) MinGW: Repeat
3) MinGW: Again, with UPX'd dll
4) Port into Wait Zar proper, with all the config file fun that entails.
Original comment by seth.h...@gmail.com
on 21 Jan 2011 at 7:11
1) works, after a dash of extern "C"
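For reference, loading without a *.lib file comes down to roughly the following. This is a sketch rather than the actual driver: TransformFn and loadTransform are hypothetical, and the exported symbol has to be declared extern "C" in the DLL so that its name is not mangled and GetProcAddress can find it. The non-Windows branch only exists so the sketch compiles elsewhere:

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical signature of the function exported by the script DLL.
typedef int (*TransformFn)(const char* input, char* out, unsigned int outSize);

// Loads `dllPath` and looks up `symbol`. Returns a null pointer if either
// step fails. On success the library is deliberately left loaded.
TransformFn loadTransform(const std::string& dllPath, const std::string& symbol) {
#ifdef _WIN32
    HMODULE lib = LoadLibraryA(dllPath.c_str());
    if (!lib) return 0;
    FARPROC proc = GetProcAddress(lib, symbol.c_str());
    if (!proc) {
        FreeLibrary(lib);
        return 0;
    }
    return reinterpret_cast<TransformFn>(proc);
#else
    (void)dllPath;
    (void)symbol;
    return 0;  // Not Windows: nothing to load in this sketch.
#endif
}
```

A null result is the signal to treat the extension as missing.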
Original comment by seth.h...@gmail.com
on 22 Jan 2011 at 5:41
2) Done.
Original comment by seth.h...@gmail.com
on 22 Jan 2011 at 7:57
3) Done
Time to think about config fun-ness.
Original comment by seth.h...@gmail.com
on 22 Jan 2011 at 8:10
Note that loading the DLL into memory takes about 3.5MB. Not a big deal (since
it's compact & only 700kb on disk), but this reinforces the idea that we'll need
the ability to disable DLLs.
For configs, I'm thinking of something like this:
"languages.myanmar.tranformations" :
{
"uni2ayar" :
{
"from-encoding" : "unicode",
"to-encoding" : "ayar",
"type" : "javascript",
"source-file" : "uni2ayar.js",
}
}
That handles the transformation. (Note that uni2ayar.js is located in the
current directory, as with most config settings). If "javascript" is disabled
(or the DLL is missing) then this simply discards the transformation.
Now, for the DLL loading, we should probably have a directory called
"config/Common" which contains them. All DLLs must load using a path relative
to Common (no sub-directories either). DLLs may have an MD5 checksum, and may
be disabled. We'll need a new top-level directive (like "languages" and
"settings"), and these should be resolved _first_ in resolvePartialSettings().
Something like this:
"extensions" :
{
"javascript" :
{
"library-file" : "v8_wz.dll",
"enabled" : "yes",
"md5-hash" : "6CC7C73271E31F5D4AD48BCACD27A4EB",
"check-md5" : "yes"
},
#More
}
Note that, since loading and then unloading a DLL still leaves a good deal of
memory allocated (1MB), it makes sense to simply load the DLL (unless disabled),
test for the conversion function, then leave it open.
After we implement this option, the first thing to do will be to check for
memory leaks.
The second thing to do will be to implement a "fallback" function. For example,
disabling javascript might disable Ayar, but it shouldn't disable, say,
Burglish.
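A sketch of the resolution order described above: extensions are resolved first, then any transformation whose "type" names a disabled or missing extension is discarded while the rest are kept. All type and function names here are hypothetical, and "builtin" is an assumed type for transformations that need no DLL:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Extension {
    bool enabled;  // "enabled" : "yes" in the config
    bool loaded;   // DLL found, checksum accepted, function resolved
};

struct Transformation {
    std::string name;  // e.g. "uni2ayar"
    std::string type;  // e.g. "javascript"
};

typedef std::map<std::string, Extension> ExtensionMap;

// A type is usable if it needs no extension at all, or if its extension is
// both enabled and successfully loaded.
bool typeUsable(const std::string& type, const ExtensionMap& exts) {
    if (type == "builtin") return true;  // assumed: no DLL required
    ExtensionMap::const_iterator it = exts.find(type);
    return it != exts.end() && it->second.enabled && it->second.loaded;
}

// Keeps only the transformations that can actually run.
std::vector<Transformation> keepUsable(const std::vector<Transformation>& all,
                                       const ExtensionMap& exts) {
    std::vector<Transformation> kept;
    for (std::size_t i = 0; i < all.size(); ++i) {
        if (typeUsable(all[i].type, exts)) kept.push_back(all[i]);
    }
    return kept;
}
```

With this shape, disabling "javascript" drops uni2ayar but leaves anything that does not depend on it untouched.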
Original comment by seth.h...@gmail.com
on 24 Jan 2011 at 4:43
Done. There are some minor problems reporting errors (and fallbacks) but I think
fixing them will require re-writing the config parser a bit.
Original comment by seth.h...@gmail.com
on 25 Jan 2011 at 7:46
Original issue reported on code.google.com by
seth.h...@gmail.com
on 7 Mar 2010 at 1:08