python / cpython

The Python programming language
https://www.python.org

ctypes cdll list exported functions #113388

Open Zibri opened 8 months ago

Zibri commented 8 months ago

Feature or enhancement

Proposal:

After loading a library with cdll.LoadLibrary, it would be great (and it is much requested) to have the list of exported functions available. For example:

from ctypes import cdll

mylib = cdll.LoadLibrary("libname")
dir(mylib)  # or: mylib.exports (proposed; neither lists the exports today)

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

1MLightyears commented 8 months ago

This request is understandable, but neither of the calls you proposed seems appropriate, to my understanding.

  1. As mylib is a Python object, NOT a real dll fd, dir() should NOT return the symbols in "libname", but the methods and properties of mylib.
  2. mylib.exports seems better, but what if "libname" contains a symbol that is exactly "exports"?

I vote for adding a new method for cdll objects, like describe() or exports(), etc.
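For context on the name-collision concern, ctypes already distinguishes attribute access from item access on a loaded library. The following is only a sketch of that existing behaviour, assuming a POSIX system where `CDLL(None)` opens the current process; it is not a proposal for the new API.

```python
import ctypes

# Assumption: a POSIX system, where CDLL(None) opens the current process
# (which normally includes the C library).
lib = ctypes.CDLL(None)

# Attribute access: convenient, but it shares a namespace with the Python
# attributes of the CDLL object itself (for example `_handle` or `_name`).
strlen_attr = lib.strlen

# Item access: the key is always treated as a symbol name, so it cannot be
# shadowed by a Python attribute or method of the same name.
strlen_item = lib["strlen"]

print(strlen_attr(b"hello"), strlen_item(b"hello"))
```

A symbol that happened to be called "exports" would therefore stay reachable through `lib["exports"]` even if `exports` became a method.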

ronaldoussoren commented 8 months ago

Implementing this might be hard regardless of the API. On Unix-y platforms, ctypes uses dlopen to open shared libraries and dlsym to look up symbols. AFAIK the dlopen APIs don't have a way to iterate over the available symbols.
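To illustrate the limitation: with a lookup-by-name interface, the most one can do is ask whether a given name resolves. The sketch below assumes a POSIX system; `probe_symbols` and the candidate list are hypothetical names used only for illustration.

```python
import ctypes


def probe_symbols(lib, candidates):
    """Return the candidate names that the library can resolve."""
    found = []
    for name in candidates:
        try:
            lib[name]  # by-name lookup; raises AttributeError if unresolved
        except AttributeError:
            continue
        found.append(name)
    return found


# Assumption: POSIX, where CDLL(None) refers to the current process.
libc = ctypes.CDLL(None)
print(probe_symbols(libc, ["strlen", "printf", "no_such_function_xyz"]))
```

This only confirms names that are already known; it cannot discover names, which is what the feature request asks for.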

1MLightyears commented 8 months ago

I'm not very familiar with GNU low-level platform libraries, so I had to search Google and Stack Overflow for information...

> Implementing this might be hard regardless of the API. On Unix-y platforms, ctypes uses dlopen to open shared libraries and dlsym to look up symbols. AFAIK the dlopen APIs don't have a way to iterate over the available symbols.

As far as I have learned, dlsym uses what is usually called "reflection" in other languages to find the symbol. I found its implementation here, and it suggests that we could implement a version of it by simply iterating over the symbol table. A more formal and practical approach can also be found here. Given this information, I think it would not be so hard to write a version of our own.

There are only two problems. First, I don't know whether other platforms have similar functions. Second, nm/readelf/objdump should be more or less enough to get a human-readable symbol table, which means such a function would add little.

Zibri commented 8 months ago

Well, "nm -D LIBRARY.so" works, so implementing the same mechanism should be more than possible.
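As a stopgap, the nm output can be consumed from Python today. This is a minimal sketch, assuming GNU binutils `nm` is on the PATH; `parse_nm_output` and `list_exports` are hypothetical helper names, not part of ctypes.

```python
import subprocess


def parse_nm_output(text):
    """Extract names of defined code symbols from `nm -D` output."""
    names = []
    for line in text.splitlines():
        parts = line.split()
        # Defined symbols have three fields: address, type letter, name.
        # "T" is a global text (code) symbol, "W" a weak symbol.
        if len(parts) == 3 and parts[1] in ("T", "W"):
            # Newer nm versions append a symbol version such as "@@GLIBC_2.2.5".
            names.append(parts[2].split("@")[0])
    return names


def list_exports(path):
    """Run nm on a shared library and return its exported code symbols."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        check=True, capture_output=True, text=True,
    )
    return parse_nm_output(result.stdout)
```

Usage would look like `list_exports("/path/to/LIBRARY.so")`. Note that this depends on an external tool and on its text output format, so it is not portable to Windows or to systems without binutils.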

Zibri commented 8 months ago

@1MLightyears the API is not the issue... that is up to the maintainers to choose.

Zibri commented 8 months ago

The relevant code can be found in ptrace source for ELF files.

1MLightyears commented 8 months ago

@Zibri The problem is that the same mechanism could be HUGE. Here you can find an implementation of readelf, which has 16,704 lines. Even if the maintainers wrote a much smaller version of their own, it could still be a lot of lines, and that's only for ELF; we haven't even mentioned Windows yet.

My point is: do we really need this when nm/objdump/readelf are already handy enough?

Zibri commented 8 months ago

@1MLightyears I am just saying it would be handy.

You can find the source here for example: https://github.com/hello2mao/XHook/blob/master/ref/jni/hijack_ref/hijack.c

function: static symtab_t load_symtab(char *filename)
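To give a rough idea of what reading the dynamic symbol table involves, here is a minimal Python sketch. It is not a port of the linked `load_symtab`; it assumes a 64-bit little-endian ELF file with section headers present, skips GNU indirect functions, and `elf64_exported_functions` is a hypothetical name.

```python
import struct

SHT_DYNSYM = 11  # section type of the dynamic symbol table
SHN_UNDEF = 0    # st_shndx of symbols that are imported, not defined
STT_FUNC = 2     # symbol type: function


def elf64_exported_functions(data):
    """Return names of defined function symbols in .dynsym (ELF64 LE only)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")
    # ELF header: e_shoff at 0x28, e_shentsize at 0x3A, e_shnum at 0x3C.
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 0x3A)
    # Section header fields: name, type, flags, addr, offset, size,
    # link, info, addralign, entsize.
    sections = [
        struct.unpack_from("<IIQQQQIIQQ", data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]
    names = []
    for sh in sections:
        sh_type, sh_offset, sh_size, sh_link, sh_entsize = (
            sh[1], sh[4], sh[5], sh[6], sh[9])
        if sh_type != SHT_DYNSYM or sh_entsize == 0:
            continue
        str_offset = sections[sh_link][4]  # sh_link names the string table
        for off in range(sh_offset, sh_offset + sh_size, sh_entsize):
            st_name, st_info, _, st_shndx, _, _ = struct.unpack_from(
                "<IBBHQQ", data, off)
            if st_shndx == SHN_UNDEF or (st_info & 0xF) != STT_FUNC:
                continue
            start = str_offset + st_name
            end = data.index(b"\0", start)
            names.append(data[start:end].decode("ascii", "replace"))
    return names
```

Usage would be along the lines of `elf64_exported_functions(open(path, "rb").read())`. Covering 32-bit and big-endian ELF, symbol versioning, Mach-O, and PE would each need additional code, which is where the maintenance concern raised in this thread comes from.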

1MLightyears commented 8 months ago

Okay... I have to admit that this

> You can find the source here for example: https://github.com/hello2mao/XHook/blob/master/ref/jni/hijack_ref/hijack.c

makes sense... Maybe it's not that complex, and the only problem is whether it's worth doing this.

Zibri commented 8 months ago

Considering how many people are asking for this (on Stack Overflow, for example), I'd say it's worth it.

ronaldoussoren commented 8 months ago

> Well, "nm -D LIBRARY.so" works, so implementing the same mechanism should be more than possible.

Sure, but that could be a lot of work and complicated code given the number of platforms we support. The code not only has to be written, but also needs to be maintained over the years.

That makes it unlikely that we'll do this in CPython if the dlopen APIs don't support this (as well as the corresponding APIs used on Windows).

> This request is understandable, but neither of the calls you proposed seems appropriate, to my understanding.
>
>   1. As mylib is a Python object, NOT a real dll fd, dir() should NOT return the symbols in "libname", but the methods and properties of mylib.
>   2. mylib.exports seems better, but what if "libname" contains a symbol that is exactly "exports"?
>
> I vote for adding a new method for cdll objects, like describe() or exports(), etc.

Functions in a shared library can be accessed as attributes on a CDLL instance. That makes dir() the natural API to list available functions in the CDLL.
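A rough sketch of what that could look like from the user's side follows. It is not how CPython does or would implement it: `ListingCDLL` and its `export_names` parameter are hypothetical, the names are supplied by the caller because ctypes has no way to enumerate them, and `CDLL(None)` assumes a POSIX system.

```python
import ctypes


class ListingCDLL(ctypes.CDLL):
    """Hypothetical CDLL subclass whose dir() also reports export names."""

    def __init__(self, name, export_names=(), **kwargs):
        super().__init__(name, **kwargs)
        # Assumption: the caller obtained these names by some other means.
        self._export_names = tuple(export_names)

    def __dir__(self):
        # Merge the normal Python attributes with the known export names.
        return sorted(set(super().__dir__()) | set(self._export_names))


# Assumption: POSIX, where None refers to the current process.
lib = ListingCDLL(None, export_names=["strlen"])
print("strlen" in dir(lib))
print(lib.strlen(b"abc"))
```

Note that plain `CDLL` already adds a function to the instance after its first attribute access, so `dir()` shows functions that have been used; the open question in this issue is listing the ones that have not.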