juancarlospaco / faster-than-requests

Faster requests on Python 3
https://gist.github.com/juancarlospaco/37da34ed13a609663f55f4466c4dbc3e
MIT License

Segmentation Fault #164

Closed: primatekid closed this issue 3 years ago

primatekid commented 3 years ago

I'm not sure exactly what causes this behaviour, but I've noticed that some of the examples crash with a "Segmentation fault" error.

For example, running this snippet (as seen in the docs)

import faster_than_requests as requests
requests.get2str2(["https://httpbin.org/uuid", "https://httpbin.org/uuid"])

gives "Segmentation fault".

However, after reading the code and inspecting other examples, the following code:

import faster_than_requests
faster_than_requests.init_client()
print(
  faster_than_requests.get2str2(
    list_of_urls = ["https://httpbin.org/uuid", "https://httpbin.org/uuid"]
  )
)
faster_than_requests.close_client()

works perfectly.

It seems that calling init_client() first makes all the difference. So, wouldn't it be appropriate to describe this approach in the docs too?
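Editor's sketch of the pattern described above: a small context manager that pairs init_client() with close_client(), so the client is closed even if the request raises. The helper names client_session and fetch_all are hypothetical and not part of faster_than_requests. Only init_client, get2str2(list_of_urls=...) and close_client are taken from the snippet above. The module is passed in as a parameter so the wrapper can be exercised with a stand-in object.

```python
from contextlib import contextmanager


@contextmanager
def client_session(ftr):
    """Initialise the client on entry and always close it on exit.

    `ftr` is assumed to be the imported faster_than_requests module
    (or any object exposing init_client() and close_client()).
    """
    ftr.init_client()
    try:
        yield ftr
    finally:
        # Runs on normal exit and when the body raises a Python exception.
        ftr.close_client()


def fetch_all(ftr, urls):
    """Fetch every URL in one call while the client is initialised."""
    with client_session(ftr) as client:
        return client.get2str2(list_of_urls=urls)
```

With the real library this would presumably be called as fetch_all(faster_than_requests, ["https://httpbin.org/uuid", "https://httpbin.org/uuid"]). Note that a finally block cannot help if the process itself dies with a segmentation fault.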

juancarlospaco commented 3 years ago

https://github.com/juancarlospaco/faster-than-requests#init_client

FlogramMatt commented 3 years ago

I'm getting segmentation faults too, on this call:

response = fast_requests.post(url=self._url, http_headers=self._fast_headers, body=payload)

Interestingly, it always happens on the same request number, but which number that is depends on how I call the library.

Note: one request number corresponds to one request in my program, which may involve multiple faster_than_requests calls. Every run fetches exactly the same data.

It always happens on request number 6 when all of this is in a loop:

fast_requests.init_client()
response = fast_requests.post(url=self._url, http_headers=self._fast_headers, body=payload)
fast_requests.close_client()

It always happens on request number 608 when init_client is called once and only post is called in a loop.

It always happens on request number 922 when init_client is never called and post is called in a loop.

Going to try updating to the newest version, but I suspect that won't fix it given this ticket is still open.

This might be of interest: https://docs.w3cub.com/nim/segfaults

I would be happy to run some code that prints and flushes debugging messages after every line in post to pinpoint exactly where the segfault is happening if it'd help given how repeatable this is for me.
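Editor's sketch of the "print and flush" idea on the Python side, assuming nothing about the library's internals. It uses the standard-library faulthandler module, which dumps the Python traceback when the process receives SIGSEGV, and it flushes a numbered log line before every call so the last line written identifies the failing request. The names traced_loop, payloads and send are hypothetical. send stands in for whatever callable wraps fast_requests.post.

```python
import faulthandler
import sys


def traced_loop(payloads, send, log=sys.stderr):
    """Call send(payload) for each payload, logging the request number first.

    `log` must be a real file object (it needs a file descriptor), because
    faulthandler writes to it directly from the signal handler.
    """
    # On SIGSEGV, dump the Python-level traceback to `log`.
    faulthandler.enable(file=log)
    responses = []
    try:
        for number, payload in enumerate(payloads, start=1):
            # Flush before the call so this line survives a hard crash.
            print(f"request {number}: sending", file=log, flush=True)
            responses.append(send(payload))
            print(f"request {number}: ok", file=log, flush=True)
    finally:
        # Only reached if the process survived; stop writing to `log`.
        faulthandler.disable()
    return responses
```

This only shows which Python line was executing when the fault hit. The native frames inside the compiled extension still need a debugger such as gdb.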

FlogramMatt commented 3 years ago

Unfortunately, it seems to be an issue in the httpclient library (if that is the one used for this line of code: headers = newHttpHeaders(http_headers) ), not your code :(

This ticket might have a workaround where you add an extra flag when compiling via nim, though the stacktrace looks different: https://github.com/nim-lang/Nim/issues/9016

More info on this: I ran the program under gdb and it produced a stack trace that should help debug this. The allocation that ultimately triggered the fault comes first:

Program received signal SIGSEGV, Segmentation fault.
0x00007fffeaca7c2a in rawAlloc__mE4QEVyMvGRVliDWDngZCQ () from /home/centos/mczarnek/parser/src/faster_than_requests.so
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 glibc-2.17-260.el7_6.6.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-50.el7.x86_64 libcom_err-1.42.9-19.el7.x86_64 libgcc-4.8.5-44.el7.x86_64 libselinux-2.5-15.el7.x86_64 nss-softokn-freebl-3.44.0-8.el7_7.x86_64 openssl-libs-1.0.2k-21.el7_9.x86_64 pcre-8.32-17.el7.x86_64 sqlite-3.7.17-8.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0  0x00007fffeaca7c2a in rawAlloc__mE4QEVyMvGRVliDWDngZCQ () from /home/centos/mczarnek/parser/src/faster_than_requests.so
#1  0x00007fffeacad7f7 in newObj () from /home/centos/mczarnek/parser/src/faster_than_requests.so
#2  0x00007fffeacf366b in newHttpHeaders__OgMp9bWC8iYHI7SRLkYXelQ () from /home/centos/mczarnek/parser/src/faster_than_requests.so
#3  0x00007fffead3b2bf in post__vETN2ENvwu4mb9c3CeLch9bw () from /home/centos/mczarnek/parser/src/faster_than_requests.so
#4  0x00007fffead3b948 in noinline__VZ8sfrffg8NcNsSos89czMg_5 () from /home/centos/mczarnek/parser/src/faster_than_requests.so
#5  0x00007fffead36109 in postPy_wrapper__3K9bZLUpxf6UpYGhVAN3BTQ () from /home/centos/mczarnek/parser/src/faster_than_requests.so
#6  0x00000000005efd4b in cfunction_call (func=func@entry=0x7fffeb2559a0, args=args@entry=0x7ffff7fae040, kwargs=kwargs@entry=0x7fffba48e840)
    at Objects/methodobject.c:539
#7  0x000000000043606b in _PyObject_MakeTpCall (tstate=0x99f800, callable=callable@entry=0x7fffeb2559a0, args=args@entry=0xc5bef8,
    nargs=nargs@entry=0, keywords=keywords@entry=0x7fffeb249200) at Objects/call.c:191
#8  0x00000000004292ba in _PyObject_VectorcallTstate (kwnames=0x7fffeb249200, nargsf=9223372036854775808, args=0xc5bef8, callable=0x7fffeb2559a0,
    tstate=<optimized out>) at ./Include/cpython/abstract.h:116
#9  PyObject_Vectorcall (kwnames=0x7fffeb249200, nargsf=9223372036854775808, args=0xc5bef8, callable=0x7fffeb2559a0)
    at ./Include/cpython/abstract.h:127
#10 call_function (kwnames=0x7fffeb249200, oparg=, pp_stack=, tstate=0x99f800) at Python/ceval.c:5072
#11 _PyEval_EvalFrameDefault (tstate=, f=, throwflag=) at Python/ceval.c:3535
#12 0x00000000004eaf56 in _PyEval_EvalFrame (throwflag=0, f=0xc5bd50, tstate=0x99f800) at ./Include/internal/pycore_ceval.h:40
#13 _PyEval_EvalCode (tstate=0x99f800, _co=0x7fffeb2483a0, globals=, locals=locals@entry=0x0, args=args@entry=0x7fffba4768c8,
    argcount=3, kwnames=0x0, kwargs=kwargs@entry=0x7fffba4768e0, kwcount=0, kwstep=kwstep@entry=1, defs=0x0, defcount=defcount@entry=0,
    kwdefs=kwdefs@entry=0x0, closure=0x0, name=<optimized out>, qualname=qualname@entry=0x7fffeb249430) at Python/ceval.c:4327
#14 0x0000000000437444 in _PyFunction_Vectorcall (func=, stack=0x7fffba4768c8, nargsf=, kwnames=)
    at Objects/call.c:396
#15 0x000000000042402e in _PyObject_VectorcallTstate (kwnames=, nargsf=, args=,
    callable=<optimized out>, tstate=<optimized out>) at ./Include/cpython/abstract.h:118
#16 PyObject_Vectorcall (kwnames=, nargsf=, args=, callable=)
    at ./Include/cpython/abstract.h:127
#17 trace_call_function (kwnames=, nargs=, args=, func=, tstate=)
    at Python/ceval.c:5053
#18 call_function (kwnames=0x0, oparg=, pp_stack=, tstate=0x99f800) at Python/ceval.c:5069
#19 _PyEval_EvalFrameDefault (tstate=, f=, throwflag=) at Python/ceval.c:3504
#20 0x000000000041ea38 in _PyEval_EvalFrame (throwflag=0, f=0x7fffba476740, tstate=0x99f800) at ./Include/internal/pycore_ceval.h:40
#21 function_code_fastcall (tstate=0x99f800, co=, args=, nargs=2, globals=) at Objects/call.c:330
#22 0x000000000042402e in _PyObject_VectorcallTstate (kwnames=, nargsf=, args=,

juancarlospaco commented 3 years ago

You should compile with ARC instead of a garbage collector, by passing --gc:arc. ARC uses one shared heap instead of one heap per thread. ARC also does its memory management at compile time.

FlogramMatt commented 3 years ago

I can give that a try. Is there no risk of cyclic references that never get broken? Is that how you normally compile it? That flag isn't set in the example you give for compiling for Windows.

I'm being pulled between projects and have been asked to prioritize getting something else working with the old, slow requests library :( It will probably be a little while before I get back to this, maybe a couple of weeks.

juancarlospaco commented 3 years ago

ARC is kind of like Rust's memory management: it injects destructors at compile time.

You can use ORC, which is ARC plus a cycle breaker. ARC/ORC is planned to become the default in the next major release of the compiler.

You can set flags in a .nim.cfg file instead of passing them all on the command line. Sorry, but pip/PyPI does not allow customization of flags.

p-i- commented 3 years ago

@juancarlospaco https://github.com/juancarlospaco/faster-than-requests/blob/master/examples/example.py is still broken

As @frankmoshe points out, it needs requests.init_client() before this line: https://github.com/juancarlospaco/faster-than-requests/blob/master/examples/example.py#L14

EVERY line after that line to the end of the file requires it.

Super grateful for the library btw 🙏

p-i- commented 3 years ago

Also this line is broken: https://github.com/juancarlospaco/faster-than-requests/blob/master/examples/example.py#L30

get2ndjson appears not to exist any more.

https://github.com/juancarlospaco/faster-than-requests/search?q=get2ndjson

p-i- commented 3 years ago

Is there any way to force FTR to recompile?

p-i- commented 3 years ago

I've tried with both ORC and ARC, and I get a segfault either way:

nim c \
    --opt:speed \
    --parallelBuild:0 \
    --threads:on \
    --app:lib \
    -d:release \
    -d:strip \
    -d:ssl \
    --gc:arc \
    --forceBuild:on \
    --os:MacOSX \
    --out:${root_dir}/__pycache__/faster_than_requests.so \
    ${root_dir}/faster_than_requests.nim

Any suggestions?

juancarlospaco commented 3 years ago

The example is now fixed and the function that no longer exists has been removed; the docs already mention it.