jart / cosmopolitan

build-once run-anywhere c library
ISC License

Compiling Python #141

Closed · ahgamut closed this 1 year ago

ahgamut commented 3 years ago

https://github.com/ahgamut/python27
https://github.com/ahgamut/cpython/tree/cosmo_py27

The assert macro needs to be changed in cosmopolitan.h to enable compilation (see #138). Afterwards, just clone the repo and run superconfigure.

Python 2.7.18 compiled seamlessly once I figured out how autoconf worked, and what flags were being fed to the source files when running make. I'm pretty sure we can compile any C-based extensions into python.exe -- they just need to be compiled/linked with Cosmopolitan, with the necessary glue code added to the Python source. For example, I was able to compile SQLite into python.exe to enable the internal _sqlite module.

The compiled APE is about 4.1MB with MODE=tiny (without any of the standard modules, the interpreter alone is around 1.6MB). Most of the modules in the stdlib compile without error. The _socket module (required for Python's simple HTTP server) doesn't compile, as it requires the structs from netdb.h.

On Windows, the APE exits immediately because the interpreter is unable to find the platform-specific files. Modules/getpath.c and Lib/site.py in the Python source try to use absolute paths from the prefixes provided during compilation; editing those files to search the right locations (possibly with some zipos magic) ought to fix this.

Keithcat1 commented 2 years ago

I wonder if Libffi could be used here in some way?

On 10/9/21, Gautham @.***> wrote:

@jart one possible way to add many small speedups is to use METH_FASTCALL instead of METH_VARARGS when specifying the arguments for a CPython function (a Python function written in C). METH_FASTCALL prevents creation of an unnecessary Python tuple of args, and instead just uses an array of PyObject *. The tradeoff is that there is some boilerplate to write to use METH_FASTCALL properly.

METH_FASTCALL went through a lot of changes in Python 3.7 (git log -i --grep="FASTCALL" between Python v3.7.12 and v3.6.15), and is much faster + nicer to use.

METH_FASTCALL is still considered an internal detail of Python 3.7 -- I hope this means it can be added to APE Python without any major compatibility issues.
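For context, here is a minimal sketch of the two calling conventions (the add functions and the method table are illustrative, not code from CPython; the METH_FASTCALL signature shown is the Python 3.7 one):

```c
#include "Python.h"

/* METH_VARARGS: CPython packs the arguments into a throwaway tuple. */
static PyObject *
add_varargs(PyObject *self, PyObject *args)
{
    long a, b;
    if (!PyArg_ParseTuple(args, "ll", &a, &b)) return NULL;
    return PyLong_FromLong(a + b);
}

/* METH_FASTCALL (3.7): arguments arrive as a plain C array, no tuple. */
static PyObject *
add_fastcall(PyObject *self, PyObject *const *args, Py_ssize_t nargs)
{
    long a, b;
    if (nargs != 2) {
        PyErr_SetString(PyExc_TypeError, "add expects 2 arguments");
        return NULL;
    }
    a = PyLong_AsLong(args[0]);
    b = PyLong_AsLong(args[1]);
    if (PyErr_Occurred()) return NULL;
    return PyLong_FromLong(a + b);
}

static PyMethodDef AddMethods[] = {
    {"add_varargs", add_varargs, METH_VARARGS, "tuple-based calling"},
    {"add_fastcall", (PyCFunction)add_fastcall, METH_FASTCALL,
     "array-based calling"},
    {NULL, NULL, 0, NULL},
};
```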

Example of manually adding METH_FASTCALL (neither of the commits below can be ported to the monorepo ATM, because _PyArg_UnpackStack and a bunch of other internals need to be moved first):

  • @.***
  • @.***

An automated way of adding METH_FASTCALL in Python is to use the Argument Clinic generator and generate the necessary clinic/*.inc files.

I tried using Argument Clinic + METH_FASTCALL in 3.6 in https://github.com/ahgamut/cpython/commit/2b417a690735b8f004257eddc526ece66dd135b4 for a few methods. The related tests pass, but there is no noticeable speedup.


Keithcat1 commented 2 years ago

I found a weird bug when running APE Python on Windows. Create a file bug.py containing import nonexistant_module, then run python.com bug.py from the command line. Python prints Traceback (most recent call last): File "t.py", line 1, in ... and then just sits there and freezes instead of exiting.

pkulchenko commented 2 years ago

@Keithcat1, this looks very similar to the frame unwinding issue I came across while compiling LuaJIT (also on windows). It doesn't happen for @ahgamut, but it does happen for me on both Windows and WSL2, so maybe related to the Windows platform. I found a "fix", but it's not really fixing the underlying issue.

ahgamut commented 2 years ago

> Trying some benchmarks might be fun

I compared APE Python to Python 3.6.15 using the pyperformance benchmark suite. I compiled Python 3.6.15 with -O3, so I used MODE=opt for the APE.

Setting up pyperformance to test the APE requires some manual work: pyperformance uses subprocess to call the Python executable, so because of #120 I had to wrap the APE in a simple shell script or change the subprocess call (basically change python.com $@ to sh python.com $@). I think this affects the performance measurements for the APE. Can we have a build mode (MODE=optlinux) that produces optimized ELF executables (or some other workaround) so that #120 doesn't happen?

@jart some important notes:

Here are the raw table numbers for reference:

| Benchmark | py36-15 | APE-python |
| --- | --- | --- |
| pickle_dict | 57.5 us | 41.2 us: 1.39x faster |
| pickle_list | 8.10 us | 6.23 us: 1.30x faster |
| pickle | 20.1 us | 18.0 us: 1.11x faster |
| json_loads | 51.5 us | 48.7 us: 1.06x faster |
| pathlib | 39.5 ms | 37.7 ms: 1.05x faster |
| unpickle | 28.2 us | 27.0 us: 1.04x faster |
| raytrace | 1.23 sec | 1.18 sec: 1.04x faster |
| unpickle_pure_python | 796 us | 772 us: 1.03x faster |
| go | 555 ms | 538 ms: 1.03x faster |
| pickle_pure_python | 1.07 ms | 1.04 ms: 1.03x faster |
| deltablue | 17.2 ms | 16.8 ms: 1.02x faster |
| scimark_sor | 466 ms | 459 ms: 1.02x faster |
| regex_compile | 384 ms | 379 ms: 1.01x faster |
| xml_etree_parse | 277 ms | 273 ms: 1.01x faster |
| telco | 14.3 ms | 14.1 ms: 1.01x faster |
| xml_etree_iterparse | 222 ms | 220 ms: 1.01x faster |
| xml_etree_generate | 223 ms | 221 ms: 1.01x faster |
| float | 228 ms | 229 ms: 1.00x slower |
| meteor_contest | 199 ms | 200 ms: 1.01x slower |
| hexiom | 22.7 ms | 22.9 ms: 1.01x slower |
| pidigits | 293 ms | 295 ms: 1.01x slower |
| nqueens | 206 ms | 207 ms: 1.01x slower |
| scimark_sparse_mat_mult | 7.57 ms | 7.64 ms: 1.01x slower |
| regex_effbot | 4.97 ms | 5.02 ms: 1.01x slower |
| scimark_monte_carlo | 222 ms | 225 ms: 1.01x slower |
| logging_silent | 745 ns | 755 ns: 1.01x slower |
| regex_v8 | 43.1 ms | 43.7 ms: 1.01x slower |
| xml_etree_process | 178 ms | 181 ms: 1.02x slower |
| chaos | 247 ms | 255 ms: 1.04x slower |
| pyflate | 1.45 sec | 1.51 sec: 1.04x slower |
| fannkuch | 997 ms | 1.04 sec: 1.04x slower |
| crypto_pyaes | 214 ms | 224 ms: 1.05x slower |
| regex_dna | 281 ms | 294 ms: 1.05x slower |
| scimark_fft | 624 ms | 653 ms: 1.05x slower |
| unpickle_list | 7.27 us | 7.72 us: 1.06x slower |
| nbody | 245 ms | 261 ms: 1.07x slower |
| unpack_sequence | 85.1 ns | 90.7 ns: 1.07x slower |
| spectral_norm | 254 ms | 273 ms: 1.07x slower |
| python_startup | 17.3 ms | 18.9 ms: 1.09x slower |
| python_startup_no_site | 10.8 ms | 12.0 ms: 1.11x slower |
| json_dumps | 24.7 ms | 27.8 ms: 1.13x slower |
| logging_simple | 20.4 us | 23.7 us: 1.16x slower |
| logging_format | 23.7 us | 27.5 us: 1.16x slower |
| Geometric mean | (ref) | 1.00x slower |

Benchmark hidden because not significant (2): scimark_lu, richards
Ignored benchmarks (1) of py36-15.json: 2to3
Ignored benchmarks (1) of APE-python.json: sqlite_synth

ahgamut commented 2 years ago

Here are the steps to obtain the above benchmark measurements:

  1. Obtain a Python (I used Python 3.7) with pyperformance installed. Let's call this the <marker> Python. Note down the location of <marker>/bin/python; it will also contain <marker>/bin/pyperformance.
  2. Obtain all other necessary member Python versions (including APE builds) and store them separately.
  3. Install pyperf and pyaes for each member Python. For the APE builds, this means byte-compiling and adding the source code of these packages into .python/ in the APE ZIP store.
  4. Write a simple shell script wrapper for each APE to avoid #120.
#!/usr/bin/env sh
location/of/the/APE/python-opt.com "$@"

Alternatively, modify run.py in pyperformance to add sh when dealing with APEs:

def run_command(command, hide_stderr=True):
    if hide_stderr:
        kw = {'stderr': subprocess.PIPE}
    else:
        kw = {}
    # ... code above is unchanged ...
    if ".com" in command[0]:  # add these two lines
        command.insert(0, "sh")
    # ... code below is unchanged ...
  5. For each member Python that you want to benchmark, run:
<marker>/bin/pyperformance run --python=/location/of/member/python --inside-venv -o member-performance.json
  6. The above comparison table can be generated with <marker>/bin/pyperf:
    <marker>/bin/pyperf compare_to py36-15.json APE-python.json --table --table-format md -G
jart commented 2 years ago

Oh wow we're the fastest at pickling.

  1. Could you run the benchmark suite again under MODE=optlinux?
  2. Could we add a benchmark for audioop.add() so we can lay claim to 10x improvements?

The new optlinux mode should give you the boost you were expecting, because it enables the red zone and disables frame pointers. For example, here's Musl Libc:

$ time python3.6 -m test.test_pickle
real    0m2.730s
user    0m2.655s
sys     0m0.074s

Here's MODE=opt:

$ time o/opt/third_party/python/pythontester.com -m test.test_pickle
real    0m2.535s
user    0m2.455s
sys     0m0.080s

Here's MODE=optlinux:

$ time o/optlinux/third_party/python/pythontester.com -m test.test_pickle
real    0m2.353s
user    0m2.270s
sys     0m0.082s

Please be advised that disabling frame pointers isn't worth the performance gain in practice, since it takes away things like backtraces. Speaking of which, @pkulchenko we now have more solid code for stack unwinding and such, which shouldn't ever crash. The recommended algorithm is this:

  size_t gi = __garbage->i;
  intptr_t addr;
  const struct StackFrame *frame;
  for (frame = __builtin_frame_address(0); frame; frame = frame->next) {
    if (!IsValidStackFramePointer(frame)) {
      __printf("%p corrupt frame pointer\n", frame);
      break;
    }
    addr = frame->addr;
    if (addr == (intptr_t)&__gc) {
      // return addresses rewritten by _gc() are recovered from the
      // garbage stack
      do --gi;
      while ((addr = __garbage->p[gi].ret) == (intptr_t)&__gc);
    }
    __printf("%p %p %s\n", frame, addr, __get_symbol_by_addr(addr));
  }

If our Python takes 10% longer to start up, that's impressive when you consider that APE binaries are shell scripts, and that it has to inflate (decompress) all the .pyc / .py files it loads from the ZIP structure. APE loading is going to become more important as apps grow larger, because bloated Python code spends most of its time crawling the filesystem if you strace it. With the ZIP central directory we can make that an effectively O(1) pure userspace operation. Right now the LIBC_ZIPOS code isn't as optimized as it could be, but to give you a basic idea of how important this boost can be, consider this benchmark:

 *     stat syscall        l:       892𝑐       288𝑛𝑠   m:     1,024𝑐       331𝑛𝑠
 *     stat() fs           l:       915𝑐       296𝑛𝑠   m:       981𝑐       317𝑛𝑠
 *     stat() zipos        l:       135𝑐        44𝑛𝑠   m:       169𝑐        55𝑛𝑠
     1,732⏰     1,109⏳   1,100k       0iop o/rel/test/libc/calls/stat_test.com -b

This is basically what Java does because enterprise apps are always humongous, so having .jar files designed as .zip files was a really brilliant move on Java's part that allowed it to meet the requirements of big companies. So in many ways you could think of Actually Portable Python as an Enterprise Python even though we're a scrappy open source project. There should ideally be some way for the benchmarks to capture that, once we optimize it more. I also spotted an issue that helps us save 100 cycles on read() and write() function calls. So i/o should be a little bit better now.

PGO is something I want. I looked into doing it and came to the conclusion that I'd prefer to kick that can down the road. We can do what PGO does manually for the time being, by using the COUNTBRANCH() macro. You can wrap any expression with that, and it'll cause the linker to generate code that prints the percentage of times the branch was taken at the end of the program. If it ends up being 99% yes or 1% no, then you add LIKELY() or UNLIKELY() macros around the expression. It makes a measurable difference because the Python codebase is written in such a way that error handling code usually blocks the critical path, and everything counts in small amounts.
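For illustration, a minimal sketch of that workflow (the buffer type and GrowBuf are hypothetical, and the header paths are my assumption about where these macros live):

```c
#include <stddef.h>                // size_t
#include "libc/intrin/likely.h"    // LIKELY()/UNLIKELY() -- assumed path
#include "libc/log/countbranch.h"  // COUNTBRANCH() -- assumed path

struct Buf { size_t i, n; char *p; };  // hypothetical hot data structure
int GrowBuf(struct Buf *);             // hypothetical slow path

// Step 1: measure. At program exit, a report prints the percentage of
// times this branch was taken.
int AppendByte(struct Buf *b, char c) {
  if (COUNTBRANCH(b->i == b->n)) {
    if (GrowBuf(b) == -1) return -1;
  }
  b->p[b->i++] = c;
  return 0;
}

// Step 2: if the report says the branch is ~1% taken, swap in UNLIKELY()
// so the compiler moves the grow path off the critical path.
int AppendByteTuned(struct Buf *b, char c) {
  if (UNLIKELY(b->i == b->n)) {
    if (GrowBuf(b) == -1) return -1;
  }
  b->p[b->i++] = c;
  return 0;
}
```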

ahgamut commented 2 years ago

Actually Portable Python under MODE=optlinux is slightly (1.02x) faster than vanilla Python 3.6.15 on the pyperformance benchmark! MODE=optlinux startup times are almost equivalent.

| Benchmark | py36-15 | APE-optlinux |
| --- | --- | --- |
| pickle_dict | 57.5 us | 39.6 us: 1.45x faster |
| pickle_list | 8.10 us | 6.21 us: 1.30x faster |
| pickle | 20.1 us | 17.2 us: 1.16x faster |
| unpickle | 28.2 us | 25.8 us: 1.09x faster |
| scimark_sparse_mat_mult | 7.57 ms | 6.98 ms: 1.08x faster |
| logging_silent | 745 ns | 691 ns: 1.08x faster |
| telco | 14.3 ms | 13.2 ms: 1.08x faster |
| pickle_pure_python | 1.07 ms | 995 us: 1.08x faster |
| raytrace | 1.23 sec | 1.14 sec: 1.08x faster |
| json_loads | 51.5 us | 47.9 us: 1.07x faster |
| xml_etree_generate | 223 ms | 208 ms: 1.07x faster |
| unpickle_pure_python | 796 us | 746 us: 1.07x faster |
| pathlib | 39.5 ms | 37.0 ms: 1.07x faster |
| regex_compile | 384 ms | 363 ms: 1.06x faster |
| go | 555 ms | 526 ms: 1.06x faster |
| richards | 171 ms | 163 ms: 1.05x faster |
| xml_etree_parse | 277 ms | 264 ms: 1.05x faster |
| deltablue | 17.2 ms | 16.4 ms: 1.05x faster |
| xml_etree_iterparse | 222 ms | 215 ms: 1.04x faster |
| nqueens | 206 ms | 199 ms: 1.03x faster |
| python_startup_no_site | 10.8 ms | 10.5 ms: 1.03x faster |
| float | 228 ms | 222 ms: 1.03x faster |
| unpickle_list | 7.27 us | 7.11 us: 1.02x faster |
| scimark_sor | 466 ms | 458 ms: 1.02x faster |
| scimark_lu | 486 ms | 478 ms: 1.02x faster |
| chaos | 247 ms | 242 ms: 1.02x faster |
| hexiom | 22.7 ms | 22.4 ms: 1.02x faster |
| scimark_monte_carlo | 222 ms | 219 ms: 1.02x faster |
| xml_etree_process | 178 ms | 175 ms: 1.02x faster |
| crypto_pyaes | 214 ms | 212 ms: 1.01x faster |
| scimark_fft | 624 ms | 619 ms: 1.01x faster |
| pidigits | 293 ms | 293 ms: 1.00x faster |
| regex_effbot | 4.97 ms | 5.00 ms: 1.01x slower |
| python_startup | 17.3 ms | 17.4 ms: 1.01x slower |
| regex_v8 | 43.1 ms | 43.4 ms: 1.01x slower |
| spectral_norm | 254 ms | 258 ms: 1.02x slower |
| nbody | 245 ms | 249 ms: 1.02x slower |
| regex_dna | 281 ms | 288 ms: 1.02x slower |
| unpack_sequence | 85.1 ns | 91.3 ns: 1.07x slower |
| json_dumps | 24.7 ms | 27.0 ms: 1.09x slower |
| logging_format | 23.7 us | 27.1 us: 1.14x slower |
| logging_simple | 20.4 us | 23.3 us: 1.14x slower |
| fannkuch | 997 ms | 1.64 sec: 1.64x slower |
| Geometric mean | (ref) | 1.02x faster |

Benchmark hidden because not significant (2): meteor_contest, pyflate
Ignored benchmarks (1) of py36-15.json: 2to3
Ignored benchmarks (1) of APE-optlinux.json: sqlite_synth

ahgamut commented 2 years ago

> Could we add a benchmark for audioop.add() so we can lay claim to 10x improvements?

Pyston is looking to add custom benchmark suites to pyperformance (https://github.com/python/pyperformance/pull/109), so if we have a bunch of benchmark scripts, we could do something similar.

> PGO is something I want. I looked into doing it and came to the conclusion that I'd prefer to kick that can down the road. We can do what PGO does manually for the time being, by using the COUNTBRANCH() macro. You can wrap any expression with that, and it'll cause the linker to generate code that prints the percentage of times the branch was taken at the end of the program. If it ends up being 99% yes or 1% no, then you add LIKELY() or UNLIKELY() macros around the expression. It makes a measurable difference because the Python codebase is written in such a way that error handling code usually blocks the critical path, and everything counts in small amounts.

IIRC there are many functions with if(condition) goto error; blocks which may benefit from the UNLIKELY macro. Let me try it out. Can we try just LTO to see if it has any benefits?

Also, Python 3.6 is the last of the "slow" Python 3 versions (it will reach EOL in two months). From Python 3.7 onwards they've focused on improving performance: git log --grep between 3.7 and 3.6 shows a bunch of PRs with "performance/faster/speedup" in the description.

Keithcat1 commented 2 years ago

It seems that compiling Python in MODE=rel no longer excludes the .py files; the binary is about the same size as the Python binary built in the default mode.

Keithcat1 commented 2 years ago

And here's a build break that I've been running into. It seems to occur regardless of the mode, but for some reason it doesn't stop you from just running make again.

error:o/build/bootstrap/zipobj.com.tmp.13158: check failed: 0xffffffffffffffff != 0xffffffffffffffff (2)
7ffeaf549660 0000004018b5 NULL
7ffeaf549690 000000403b37 NULL
7ffeaf5496d0 0000004038d3 NULL
7ffeaf549700 0000004013fa NULL
7ffeaf549710 0000004015a3 NULL
7ffeaf549720 00000040116a NULL

make MODE=dbg -j4 o/dbg/third_party/python/Lib/venv/scripts/nt/Activate.ps1.zip.o exited with 77:
build/bootstrap/zipobj.com -b0x400000 -P.python -C3 -o o/dbg/third_party/python/Lib/venv/scripts/nt/Activate.ps1.zip.o third_party/python/Lib/venv/scripts/nt/Activate.ps1
consumed 126,282µs wall time
ballooned to 2,100kb in size
needed 15,829us cpu (43% kernel)
caused 597 page faults (99% memcpy)
208 context switch (97% consensual)
performed 480 read and 0 write i/o operations

make: *** [build/rules.mk:78: o/dbg/third_party/python/Lib/venv/scripts/nt/Activate.ps1.zip.o] Error 77

ahgamut commented 2 years ago

@jart auto-complete interferes with indentation in the REPL:

>>> def function(x):
>>>    return x+1 # have to press 4 spaces here, because <tab> auto-completes to __name__
jart commented 2 years ago

That sounds like an easy fix. I'll probably have to change the linenoise api to do it, but it wouldn't need to be a breaking change.

ahgamut commented 2 years ago

@jart Regarding the documentation for APE Python: since the CPython docs use sphinx, I thought it would be easier if sphinx were available.

Here are the steps to get there:

Sample screenshot:

(screenshot attached: 2022-01-09_01-23-45, 994x535)

ahgamut commented 2 years ago

Documentation is being updated here.

ahgamut commented 2 years ago

Turns out Python 3.9 can be built with Cosmopolitan Libc if you just ifdef out pthreads: https://github.com/ahgamut/cpython/tree/cosmo_py39

Careful: it's not as polished as the python.com in the repo. I haven't tested how threads work in the above APE (_threadmodule.c compiled without any warnings, though).

ahgamut commented 2 years ago

Python 3.9 APE can run without needing pthreads, similar to Python 3.6.

The only difference is that _dummy_thread.py has to be written at the C level in Py39, and some multithreading-related tests will fail because the default Py39 build expects threads to be available.

Screenshot of python39.com -m http.server built from here:

(screenshot attached: 2022-03-07_05-59-05, 1366x736)

http.server requires threads in Python 3.9, but I used WITH_THREAD for _dummy_thread-like behavior when starting a new thread, so it works.

ahgamut commented 2 years ago

The BLIS Linear Algebra Library builds with Cosmopolitan Libc under the generic target. Now numpy is just a few build configs away :)

https://github.com/ahgamut/blis/tree/cosmopolitan (run superconfigure, and then run the examples in examples/oapi)

The library builds and works on the examples without linking pthreads -- this is the second time my believing the documentation has led me astray.

@jart the constants in libc/sysv/consts/baud.h clash with some macros in BLIS.

jart commented 2 years ago

That's great news! Send a PR? Let's fix those clashes too.

jart commented 2 years ago

As for threads, we're slowly but surely making progress on that front. Malloc and other stuff has been made thread safe.

ahgamut commented 2 years ago

BLIS now builds under the penryn microarchitecture (aka Intel Core2, as specified here: https://github.com/flame/blis/blob/master/docs/HardwareSupport.md) with Cosmopolitan Libc, but requires removal of -fno-omit-frame-pointer and -pg.

The testsuite passes a lot of tests, but my system OOMs (>10GiB) before all of them complete.

Keithcat1 commented 2 years ago

I got a build error while trying: make -j4 -O o//third_party/python/python.com The important line seems to be:

error: "usr/share/ssl/root/" (o//net/https/sslroots.o) not defined by direct deps of o//net/https/https.a.pkg

Let me know if you need all of it.

ahgamut commented 2 years ago

@Keithcat1 I think I've seen this one before. Can you try building make -j4 o//net/https first and then try make -j4 o//third_party/python/python.com?

Delete the o/usr folder before trying this. If you see the backtrace even with make -j4 o//net/https please post the backtrace here.

@jart I think this is the error where zipobj.com doesn't write the symbol table entry correctly. For reference, see below.

readelf -Wa o//usr/share/ssl/root/amazon.pem.zip.o shows (note the missing relative path):

Symbol table '.symtab' contains 13 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    2 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
     9: 0000000000000000    40 OBJECT  LOCAL  DEFAULT    4 zip+lfile:amazon.pem
    10: 0000000000000000   294 OBJECT  LOCAL  DEFAULT    5 zip+cdir:amazon.pem
    11: 0000000000000028  2692 OBJECT  GLOBAL DEFAULT    4 amazon.pem
    12: 0000000000000000     0 OBJECT  GLOBAL HIDDEN   UND __zip_start

My guess is that the StripComponents function is doing something weird, but I have not been able to trigger the error in a defined manner yet.

Keithcat1 commented 2 years ago

It worked. Thank you. My next bug is that I did: make -j4 MODE=rel -O o/rel/third_party/python/python.com and it did not remove all the .py files.

jart commented 2 years ago

> BLIS now builds under the penryn microarchitecture (aka Intel Core2, as specified here: https://github.com/flame/blis/blob/master/docs/HardwareSupport.md) with Cosmopolitan Libc, but requires removal of -fno-omit-frame-pointer and -pg.

That's perfectly fine. We don't need function call tracing for performance-critical code like BLIS. It's perfectly safe to disable, provided you don't need the rich debugging.

jart commented 2 years ago

> My guess is that the StripComponents function is doing something weird, but I have not been able to trigger the error in a defined manner yet.

Please keep an eye on that and let me know. If you can help me understand what's going on then I'll surely fix it.

Keithcat1 commented 2 years ago

If you do:

bytearray(12**10)

I get this: die failed while dying, where CPython would throw a MemoryError (I assume because it ran out of memory, of course).

ahgamut commented 2 years ago

@Keithcat1 python.com built with MODE= from commit e4d6e263d4c2161d does throw a MemoryError for the example you mentioned:

Traceback (most recent call last):
  File "sample.py", line 1, in <module>
    a = bytearray(12**10)
MemoryError
Keithcat1 commented 2 years ago

I tried that and now it just exits without any message whatsoever, like this:


C:\py>cd git

C:\py\git>rd /s /q cosmopolitan

C:\py\git>wsl
[sudo] password for keith:
keith@keith-pc:/mnt/c/py/git$ git clone https://github.com/jart/cosmopolitan
Cloning into 'cosmopolitan'...
remote: Enumerating objects: 76289, done.
remote: Counting objects: 100% (1502/1502), done.
remote: Compressing objects: 100% (667/667), done.
remote: Total 76289 (delta 984), reused 1126 (delta 832), pack-reused 74787
Receiving objects: 100% (76289/76289), 104.96 MiB | 2.75 MiB/s, done.
Resolving deltas: 100% (41558/41558), done.
Updating files: 100% (19501/19501), done.
keith@keith-pc:/mnt/c/py/git$ cd cosmopolitan
keith@keith-pc:/mnt/c/py/git/cosmopolitan$
keith@keith-pc:/mnt/c/py/git/cosmopolitan$ git checkout e4d6e26
Note: switching to 'e4d6e26'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at e4d6e263d Rename ParseJson() to DecodeJson() for consistency
keith@keith-pc:/mnt/c/py/git/cosmopolitan$ make -j1 MODE= -O o//net/https
...
keith@keith-pc:/mnt/c/py/git/cosmopolitan$ make -j1 MODE= -O o//third_party/python/python.com                                                                           
...
keith@keith-pc:/mnt/c/py/git/cosmopolitan$ exit
logout

C:\py\git>cd cosmopolitan

C:\py\git\cosmopolitan>o\third_party\python\python.com
Python 3.6.14+ (Actually Portable Python) [GCC 9.2.0] on cosmo
Type "help", "copyright", "credits" or "license" for more information.
>>: q=bytearray(12**10)

C:\py\git\cosmopolitan>o\third_party\python\python.com
Python 3.6.14+ (Actually Portable Python) [GCC 9.2.0] on cosmo
Type "help", "copyright", "credits" or "license" for more information.
>>: bytearray(12**10)
Keithcat1 commented 2 years ago

When run under WSL, python.com does throw a MemoryError. On Windows, I get this:

Python 3.6.14+ (Actually Portable Python) [GCC 9.2.0] on cosmo
Type "help", "copyright", "credits" or "license" for more information.
>>: q=bytearray(12**10)
7000001fee50 000001791a7f OnUnrecoverableMmapError+31
7000001fee70 00000049c326 MapMemories+1030
7000001fef00 000001793ac4 Mmap+1108
7000001fef90 00000179401c mmap+76
7000001feff0 00000178c2f3 dlmalloc_requires_more_vespene_gas+67
7000001ff010 000001781242 mmap_alloc.constprop.0+98
7000001ff070 000001785f8a sys_alloc.constprop.0+474
7000001ff0d0 000001786c08 dlmalloc+504
7000001ff140 00000178c0c8 dlmemalign+40
7000001ff150 00000186ade4 __asan_allocate+100
7000001ff1a0 00000186afcb __asan_memalign+139
7000001ff210 0000010b5513 _PyMem_RawMalloc+19
7000001ff220 0000010b6e12 _PyMem_DebugRawAlloc+658
7000001ff280 0000010b74aa _PyMem_DebugRawRealloc+842
7000001ff2d0 0000010b771e _PyMem_DebugRealloc+14
7000001ff2e0 0000010b8701 PyObject_Realloc+33
7000001ff2f0 000000eefe97 PyByteArray_Resize+967
7000001ff360 000000ef13fc bytearray_init+1564
7000001ff510 000001145a01 type_call+433
7000001ff560 000000f471cb _PyObject_FastCallKeywords+731
7000001ff5b0 0000013f43cd _PyEval_EvalFrameDefault+55341
7000001ff8a0 0000013e2706 _PyEval_EvalCodeWithName+10998
7000001ff9a0 0000013e3f14 PyEval_EvalCodeEx+100
7000001ffa50 0000013e3f74 PyEval_EvalCode+36
7000001ffa90 00000153d100 run_mod+64
7000001ffac0 0000015474d1 PyRun_InteractiveOneObjectEx+1265
7000001ffb80 0000015480a4 PyRun_InteractiveLoopFlags+212
7000001ffc10 000001549563 PyRun_AnyFileExFlags+67
7000001ffc40 000000840408 run_file+504
7000001ffc90 000000842438 Py_Main+6696
7000001ffe40 0000004c2fba RunPythonModule+1306
7000001fff20 000000402d44 main+148
7000001fffd0 00000040ab26 cosmo+71
7000001fffe0 000001795585 _jmpstack+22
ahgamut commented 2 years ago

@Keithcat1 I'm just guessing here, but does regular stderr work on Windows? Try print("hi", file=sys.stderr) in python.com -- if it doesn't print anything then it's probably at the Python level, and we can change that in pythonrun.c or wherever the streams are initialized. This is not likely to be the cause, though.

@jart do the backtraces require the .com.dbg to be present? Perhaps this missing error log is related to that.

jart commented 2 years ago

If you're using WSL then please confirm that this same issue happens when binfmt_misc is disabled (sudo sh -c 'echo -1 >/proc/sys/fs/binfmt_misc/status') just so we know that it isn't accidentally running in the WIN32 environment.

Keithcat1 commented 2 years ago

@jart I tried that but it didn't seem to change anything; I still got a MemoryError both times.

print("Hi!", file = sys.stderr)

also worked fine.

ahgamut commented 2 years ago

Changed the name of the issue because we have 4 ports of Python to Cosmopolitan Libc:

stefnotch commented 1 year ago

@ahgamut That sounds amazing! Would it be possible for you to maybe provide a binary of one or two of them, just to make it easier for people to check it out and get excited?

(On a mostly unrelated note: The Discord link in https://ahgamut.github.io/2021/07/13/ape-python/ seems to have expired)

jart commented 1 year ago

I support @ahgamut distributing Actually Portable Python binaries on his blog. We're already doing release binaries for Actually Portable Perl. Binary releases are hard to pull off gracefully and I think @G4Vi did a great job with that.

@stefnotch if you want an Actually Portable Python binary to hold you over in the meantime, there's a link to a python.com binary in this blog post http://justine.lol/ftrace/ which you may download. It's an authentic build of Cosmopolitan's Python 3.6 under third party.

We do have a Discord and anyone reading is welcome to join: https://discord.gg/vFdkMdQN Please note this link expires in seven days. You can email jtunney@gmail.com if you need another one.

ahgamut commented 1 year ago

Actually Portable Python (CPython 3.11.4) binaries are available here: https://github.com/ahgamut/superconfigure/releases/tag/z0.0.3

Keithcat1 commented 1 year ago

They do seem to work on Windows, but you might have to run them from the command line and not Explorer. Also, it erases the entire line every time I press backspace, or plays the system sound for an invalid keypress (I think that's what it's called) if there's nothing to delete.

ingenieroariel commented 1 year ago

I was able to reproduce the build of python.com on my machine from the superconfigure repo.

@ahgamut Could this issue be closed now?

ahgamut commented 1 year ago

Very well, closing. We can re-open if there are any new major issues with building CPython.

If anyone wants to try out a CPython3.11 Actually Portable Executable, you can download one from here: https://github.com/ahgamut/superconfigure/releases/tag/z0.0.24

EirikJaccheri commented 9 months ago

I am trying to compile python.com using the cpython cosmo_py311 branch:

https://github.com/ahgamut/cpython/tree/cosmo_py311

After following the instructions and running ./superconfigure i get the following error message:

checking whether we are cross compiling... configure: error: in `/home/eirik/code_dir/cpython':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details

I also attach the config.log:

config.log

Do you know what might be causing this issue?

PS:

gcc --version returns

gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ahgamut commented 9 months ago

@EirikJaccheri I would say cosmo_py311 is outdated at this point, due to all the improvements in Cosmopolitan Libc (most notably the cosmocc toolchain, which uses a patched gcc-11 binary, and apelink, which produces fat binaries).

If you'd like to build CPython3.11 with Cosmopolitan Libc from source, I'd recommend trying out my superconfigure repo https://github.com/ahgamut/superconfigure. If you just want a python binary that's built with Cosmopolitan Libc, you can get it from the releases of that repo. If you're trying to build some specific Python packages, let me know what you have in mind.

EirikJaccheri commented 9 months ago

Hi, thank you for your quick response :-)

The reason I tried to use cosmo_py311 is that there seemed to be support for including C libraries in python.com (specifically, I would like to include numpy, clickhouse_connect, pandas, datetime, toml, sys and time).

From the superconfigure repo I got the impression that one could only include pure Python libraries. Am I wrong? Is it possible to add these libraries?

Eirik

ahgamut commented 9 months ago

The builds in superconfigure also provide C extensions (notably markupsafe and PyYAML). Someone experienced with setuptools/pip internals could do something wonderful at this stage. I'm trying to figure out a nice way to package numpy; I'll post another build on superconfigure once I figure it out.

Keithcat1 commented 9 months ago

Now that Cosmopolitan supports dlopen, can ctypes be made to work? Mostly curious.

jart commented 9 months ago

ctypes is something I could imagine working.

EirikJaccheri commented 9 months ago

Hi again,

I figured out that numpy is not a dependency of clickhouse_connect. To get clickhouse_connect to work I only need two libraries that use C extensions: zstandard and lz4 (https://pypi.org/project/zstandard/#description and https://pypi.org/project/lz4/#files).

@ahgamut Is it possible to build these packages using superconfigure? If so, how?

Eirik

ahgamut commented 9 months ago

OK, seems like it can be done, with the following steps:

  1. write a build script for https://github.com/lz4/lz4 similar to xz or gzip in https://github.com/ahgamut/superconfigure/tree/main/compress
  2. copy https://github.com/indygreg/python-zstandard/tree/main/c-ext into the CPython source tree, and write a Modules/Setup recipe for it, similar to yaml (a hedged sketch of such a recipe follows this list)
  3. build CPython via superconfigure
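For step 2, a Modules/Setup recipe is a single line naming the module, its source files, and its compile/link flags. A sketch of what it might look like (the module name, file name, and flags here are illustrative guesses, not the actual recipe):

```
*static*
# hypothetical Modules/Setup entry for a vendored zstandard C extension
zstd Modules/zstd/backend_c.c -IModules/zstd -lzstd
```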
rupurt commented 9 months ago

@ahgamut superconfigure looks awesome! Thank you for the hard work.

Do you have any plans to add a python single executable cross compiler? Something like pyinstaller or nuitka?

ahgamut commented 9 months ago

I'm pretty happy building python via superconfigure for now -- using cosmocc as my cross-compiler, the scripts in superconfigure get me to a single python executable for my uses.

rupurt commented 9 months ago

Interesting. Are you saying you can already make a single executable of the python app + the python runtime with cosmocc?

If so do you mind pointing me to the piece of code that does it and I can try myself?