theChaosCoder / vsdb

VSDB - A Database for VapourSynth Scripts & Plugins
https://forum.doom9.org/showthread.php?t=175702

Check for new script functions automatically #4

Open theChaosCoder opened 6 years ago

theChaosCoder commented 6 years ago
OrangeChannel commented 4 years ago

Most of this can be done with the inspect and pydoc modules, plus some minor rewrites of their internal methods. If you're still interested in this, what would you need a parser to return so that you could compare it with the current database? I don't really know how the db is stored or edited (JSON?). And how would we update the db accordingly?
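As a rough sketch of what that comparison might look like, assuming the database can hand back the function names it already knows for a script as a plain set (the helper `diff_functions` and the sample names are hypothetical, not part of vsdb):

```python
def diff_functions(known: set, scanned: set) -> dict:
    """Compare names stored in the db with names found by a fresh scan."""
    return {
        'added': sorted(scanned - known),    # in the script, missing from the db
        'removed': sorted(known - scanned),  # in the db, gone from the script
    }


# Hypothetical data: what the db holds vs. what a scan of the script returns.
known = {'comp', 'deblend', 'old_helper'}
scanned = {'comp', 'deblend', 'fix_cr_tint'}
changes = diff_functions(known, scanned)
print(changes)
```

Whether 'removed' entries should be deleted from the db or only flagged is a separate decision; the sketch only reports them.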

Regarding the ignore list: assuming the Python scripts only define functions and not class methods, an __all__ is officially the solution for this. By default, though, pydoc and similar tools already disregard 'private' functions whose names start with one or two underscores.
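A small illustration of that filtering, using pydoc.visiblename on a throwaway in-memory module (the module `demo` and its function names are made up for the example):

```python
import pydoc
import types

# Hypothetical in-memory module standing in for a real script.
demo = types.ModuleType('demo')
exec(
    "def public(): pass\n"
    "def other(): pass\n"
    "def _private(): pass\n",
    demo.__dict__,
)

names = ['public', 'other', '_private']

# Without __all__, only names starting with an underscore are hidden.
default_visible = [n for n in names if pydoc.visiblename(n, None, demo)]

# With __all__, only the listed names are shown.
demo.__all__ = ['public']
all_visible = [n for n in names if pydoc.visiblename(n, demo.__all__, demo)]

print(default_visible)
print(all_visible)
```

So a script author who wants to hide a helper can either prefix it with an underscore or leave it out of __all__.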

There is also some more stuff that could be done here, including getting function synopses + their full docstrings, their parameters + typehints, and we could also grab clip formatting information from the docstrings assuming they have them in some standardized format (I propose something similar to the PEP8/reST docstring formatting):

import vapoursynth as vs

def func(frame: int, clip: vs.VideoNode = None) -> vs.VideoNode:
    """Function synopsis in one sentence.

    Additional information about the function goes here.
    And here.

    :param frame: <int> description

    :param clip: <vapoursynth.VideoNode> input clip
        :bit depth: 8, 16, 32
        :color family: YUV RGB GRAY
        :float precision: H, S
        :sample type: Float, Integer
        :subsampling: 444,410,420,422

    :returns: <vapoursynth.VideoNode>
    """
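If docstrings followed a layout like the one above, the per-clip fields could be pulled out with a small parser. This is only a sketch under that assumption: the function name `parse_clip_fields` and both regular expressions are hypothetical, and it expects sub-fields to be indented under their `:param` line.

```python
import inspect
import re

PARAM_RE = re.compile(r'^:param (\w+):')
FIELD_RE = re.compile(r'^\s+:([\w ]+):\s*(.*)$')


def parse_clip_fields(doc: str) -> dict:
    """Map each :param name to its indented ':field: values' entries."""
    fields = {}
    current = None
    # cleandoc removes the common indentation, so :param lines start at column 0.
    for line in inspect.cleandoc(doc).splitlines():
        param = PARAM_RE.match(line)
        if param:
            current = param.group(1)
            fields[current] = {}
            continue
        field = FIELD_RE.match(line)
        if field and current is not None:
            key, raw = field.group(1), field.group(2)
            # Values may be separated by commas, spaces, or both.
            fields[current][key] = [v for v in re.split(r'[,\s]+', raw) if v]
        elif line.strip() and not line.startswith(' '):
            current = None  # any other top-level line ends the parameter block
    return fields


EXAMPLE = """Function synopsis in one sentence.

    :param frame: <int> description

    :param clip: <vapoursynth.VideoNode> input clip
        :bit depth: 8, 16, 32
        :subsampling: 444,410,420,422

    :returns: <vapoursynth.VideoNode>
    """

parsed = parse_clip_fields(EXAMPLE)
print(parsed)
```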

And I also propose limiting the values for each category to these lists (making some standard here would help fix mistakes like this) which were grabbed from print(vs.core.std.BlankClip(format=vs.GRAYH).format).

color_families = ['COMPATBGR32', 'COMPATYUY2', 'GRAY', 'RGB', 'YCOCG', 'YUV']
# Subsampling W/H for following strings: 410=(2/2), 411=(2/0), 420=(1/1),
#                                        422=(1/0), 440=(0/1), 444=(0/0).
YUV_subsampling = ['410', '411', '420', '422', '440', '444']
# Always represents `Bits Per Sample`, so RGB24 would be `8`.
bits_per_sample = ['8', '9', '10', '12', '14', '16']
sample_type = ['Float', 'Integer']
# 2/4 represent `Bytes Per Sample`, 16/32 represent `Bits Per Sample`
float_precision = ['H', 'S', '2', '16', '4', '32']

others = ['ANY', 'NONE', 'UNKNOWN']

theChaosCoder commented 4 years ago

The db is a MySQL database.

pydoc is really nice! I also tried inspect, dir and just parsing def() lines but pydoc seems to be a very good solution which does most of the work.

I guess I could also include the docstrings from pydoc (also the very long docstrings? See havsfunc). Do we really need typehints? One can always check the linked scripts. Parameters were always on the todo list.

"grab clip formatting information from the docstrings assuming they have them in some standardized format"

This would be nice but I doubt enough people would do this. We often lack even basic information.

"And I also propose limiting the values for each category to these lists (making some standard here would help fix mistakes like this) which were grabbed from ..."

The plan was to limit the allowed values after some evaluation time. I will add some restrictions/checks.
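Such a check could look roughly like this. It is only a sketch: the lists are the ones proposed earlier in the thread, while the category keys, the `check_values` helper, and the use of difflib to suggest a correction are assumptions made for illustration.

```python
import difflib

# Allowed values per category, copied from the lists proposed above.
ALLOWED = {
    'color family': ['COMPATBGR32', 'COMPATYUY2', 'GRAY', 'RGB', 'YCOCG', 'YUV'],
    'subsampling': ['410', '411', '420', '422', '440', '444'],
    'bit depth': ['8', '9', '10', '12', '14', '16'],
    'sample type': ['Float', 'Integer'],
    'float precision': ['H', 'S', '2', '16', '4', '32'],
}
OTHERS = ['ANY', 'NONE', 'UNKNOWN']


def check_values(category, values):
    """Return (bad_value, suggestion_or_None) for every value not allowed."""
    allowed = ALLOWED[category] + OTHERS
    problems = []
    for value in values:
        if value not in allowed:
            close = difflib.get_close_matches(value, allowed, n=1)
            problems.append((value, close[0] if close else None))
    return problems


problems = check_values('color family', ['YUV', 'GARY'])
print(problems)
```

The suggestion is only a hint for whoever edits the entry; nothing here corrects values automatically.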

p.s. Maybe GARY is the new video snail standard format in case you didn't know ;D

OrangeChannel commented 4 years ago

Here's a short function to grab the function names from a module given its name (i.e. assuming you have lvsfunc.py somewhere in sys.path):

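import importlib
import inspect
import pydoc


def docmodule(module_name: str):
    """Return (name, function) pairs for the public functions defined in a module."""
    # import_module returns the named module itself, even for dotted names.
    module = importlib.import_module(module_name)
    mod_obj, name = pydoc.resolve(module)
    # __all__, if defined, decides which names count as public.
    public_names = getattr(mod_obj, '__all__', None)
    funcs = []
    func_objects = []
    for funcname, func_obj in inspect.getmembers(mod_obj, inspect.isroutine):
        # Keep visible names that were defined in this module, not imported into it.
        if pydoc.visiblename(funcname, public_names, mod_obj) and name == func_obj.__module__:
            # Skip aliases that point at a function we have already seen.
            if func_obj not in func_objects:
                func_objects.append(func_obj)
                funcs.append((funcname, func_obj))

    return funcs


for name, func in docmodule('lvsfunc'):
    print(name)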

comp
cond_desc
deblend
fix_cr_tint
limit_dark
...

It avoids listing duplicate functions (i.e. shorthands like comp = compare) and doesn't show methods from other modules like inspect.getmembers(object, inspect.isroutine) usually would.

From this, you can also get the parameters of each function:

for name, func in docmodule('vscompare'):
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        # Some callables (e.g. certain builtins) expose no signature.
        continue
    print(name + ':')
    print(list(params))

comp:
['frames', 'rand', 'slicing', 'slices', 'full', 'label', 'label_size', 'label_alignment', 'stack_type', 'in_clips']
prep:
['clips', 'w', 'h', 'dith', 'yuv444', 'static']
save:
['frames', 'rand', 'folder', 'zoom', 'clips']
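If defaults and typehints turn out to be wanted as well, the same Signature object carries them. A minimal sketch, where `describe_parameters` and the `demo` function are hypothetical stand-ins rather than anything from vscompare:

```python
import inspect


def describe_parameters(func):
    """List each parameter with its default and annotation, if it has them."""
    empty = inspect.Parameter.empty
    described = []
    for param in inspect.signature(func).parameters.values():
        annotation = param.annotation
        described.append({
            'name': param.name,
            # repr keeps the default readable and storable as text.
            'default': None if param.default is empty else repr(param.default),
            # Classes have __name__; string annotations are kept as written.
            'annotation': None if annotation is empty
            else getattr(annotation, '__name__', str(annotation)),
        })
    return described


def demo(frames: int, rand: int = 0, folder=False):
    """Hypothetical function used only to show the output shape."""


described = describe_parameters(demo)
for entry in described:
    print(entry)
```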

Grabbing methods from classes is a bit uglier, but it's very doable. This could be cleaned up somewhat, but let me know if you could do some database magic given what is returned here.

EDIT: Docstrings for functions are also available, e.g. with print(docmodule('lvsfunc')[0][1].__doc__).

import lvsfunc
print(pydoc.splitdoc(pydoc.getdoc(lvsfunc.source))[0])

can be used to get the one-line synopsis of the function.
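Put together, each function could be turned into a small record holding its name, synopsis, and the rest of its docstring. This is a sketch only: `function_record` is a hypothetical helper, and the `source` function below is a made-up stand-in, not lvsfunc.source.

```python
import pydoc


def function_record(name, func):
    """Bundle a function's name, synopsis, and remaining docstring text."""
    synopsis, details = pydoc.splitdoc(pydoc.getdoc(func))
    return {'name': name, 'synopsis': synopsis, 'details': details}


def source(path):
    """Load a clip from a file.

    Picks an indexer based on the file extension.
    """


record = function_record('source', source)
print(record)
```

Keeping the synopsis separate from the details means the very long docstrings can be stored or skipped independently of the one-line summary.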

theChaosCoder commented 4 years ago

Thx, I will try to add some of the "new data" next week or so.