How to use this tool as an API in Python code?

priv-kweihmann / multimetric

Calculate code metrics in various languages

zlib License

How to use this tool as an API in Python code? #34

Open zhimin-z opened 1 year ago

priv-kweihmann commented 1 year ago

Hi @zhimin-z,

currently that's not a supported use case, but it should be quite easy to do. What needs to be done is:

- move most of the processing code of __main__.py:main into a separate function that takes all of the arguments currently extracted from the call at https://github.com/priv-kweihmann/multimetric/blob/a1d3680bcd15794dcf50e6492d3fb70e2824e107/multimetric/__main__.py#L88
- reduce the new main to the argparse call, the call to the newly created function, and the printing of the results

Once that is done, you could just add from multimetric.__main__ import <newly added function> to your code and call it as if you were running the tool from the CLI.
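To make the proposed split concrete, here is a minimal sketch of the pattern on a toy CLI. It is not multimetric's actual code: run, parse_args, and the line-count metric are hypothetical stand-ins, chosen only to show a thin main wrapped around an importable function.

```python
import argparse
import json


def run(files):
    """Hypothetical processing function: everything main used to do after parsing."""
    results = {"files": {}}
    for path in files:
        with open(path, encoding="utf-8") as handle:
            lines = handle.read().splitlines()
        # Stand-in metric (non-blank lines); the real tool computes far more
        results["files"][path] = {"loc": sum(1 for line in lines if line.strip())}
    return results


def parse_args(argv=None):
    """Only the argument parsing stays tied to the command line."""
    parser = argparse.ArgumentParser(description="toy metrics CLI")
    parser.add_argument("files", nargs="+", help="files to analyze")
    return parser.parse_args(argv)


def main(argv=None):
    """Thin entry point: parse, delegate, print."""
    args = parse_args(argv)
    print(json.dumps(run(args.files), indent=2))
```

With that layout, library users would import and call run directly with a list of paths, while the CLI keeps going through main.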

Maybe @aylusltd has the time to do that, but also feel free to provide the necessary patches - PRs highly welcome

In the meantime, you could also wrap the invocation of this tool via subprocess and just parse the output:

json.loads(subprocess.check_output(['multimetric', <args go here>, *files], universal_newlines=True))
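Spelled out as a small helper, the one-liner above could look like the sketch below. The function name and its parameters are made up for illustration; the only thing taken from the thread is that the multimetric command prints JSON to stdout. The command parameter exists so the executable can be swapped out, for example in tests.

```python
import json
import subprocess


def run_multimetric(files, extra_args=(), command=("multimetric",)):
    """Run the CLI on the given files and return its JSON output as Python data."""
    # Same argument order as the one-liner: executable, options, then files
    cmd = [*command, *extra_args, *files]
    # check_output raises CalledProcessError on a non-zero exit code
    output = subprocess.check_output(cmd, universal_newlines=True)
    return json.loads(output)
```

Any options the tool accepts would go into extra_args unchanged; see the README for what is available.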

zhimin-z commented 1 year ago

If I have a dataframe consisting of code & text, how could I call multimetric to parse it? For example, my dataframe looks like this:

I do not want to delete the created time & closed time, but I want to feed the entire dataframe to multimetric.

priv-kweihmann commented 1 year ago

@zhimin-z the example doesn't look like code to me. If it were code, though, and assuming the data in that table comes from some sort of structure, the following pseudocode could work:

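import json
import subprocess
import tempfile

# datastructure: any iterable of dict-like rows with a 'Challenge_body' entry
for item in datastructure:
    # A named file opened in text mode, so the CLI can be handed a real path
    with tempfile.NamedTemporaryFile(mode='w') as i:
        i.write(item['Challenge_body'])
        i.flush()
        try:
            item['multimetric'] = json.loads(subprocess.check_output(['multimetric', i.name], universal_newlines=True))
        except (subprocess.CalledProcessError, json.JSONDecodeError):
            # Skip rows the tool cannot process
            pass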

The matching result for each row would then be accessible as 'multimetric' for any kind of further processing. One caveat here: as the input file doesn't have any extension, you might need to pass the language to be used for the lexer manually (see the README for more details).
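One way around the missing extension might be to give the temporary file a suffix. The sketch below assumes the lexer is picked from the file name, which is what the caveat suggests but which I have not spelled out here; analyze_snippet, analyze_rows, and their parameters are names made up for this example.

```python
import json
import os
import subprocess
import tempfile


def analyze_snippet(code, suffix=".py", command=("multimetric",)):
    """Write one snippet to a temporary file with an extension and analyze it.

    Returns the parsed JSON, or None if the tool fails or prints invalid JSON.
    """
    # delete=False so the file is closed before the tool opens it
    with tempfile.NamedTemporaryFile(
        "w", suffix=suffix, delete=False, encoding="utf-8"
    ) as handle:
        handle.write(code)
        path = handle.name
    try:
        output = subprocess.check_output([*command, path], universal_newlines=True)
        return json.loads(output)
    except (subprocess.CalledProcessError, json.JSONDecodeError):
        return None
    finally:
        os.unlink(path)


def analyze_rows(rows, column="Challenge_body", **kwargs):
    """Attach a 'multimetric' entry to each dict-like row, keeping other columns."""
    for row in rows:
        row["multimetric"] = analyze_snippet(row[column], **kwargs)
    return rows
```

For a pandas dataframe, the rows could come from df.to_dict("records"), so columns such as the created and closed times stay untouched next to the new entry.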

zhimin-z commented 1 year ago

Thanks, @priv-kweihmann! I wonder if multimetric can process code within text, since I saw there is a metric called "code_comment_ratio". Does it work for code within text as shown above?

priv-kweihmann commented 1 year ago

@zhimin-z out of the box, no, the tool can't do that right now. In theory, though, you could write your own lexer for that (see https://pygments.org/docs/lexerdevelopment/), since a lexer is needed to compute all of the statistics.
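A lighter-weight workaround, which is a different technique from writing a lexer, would be to pull the code out of the text first and feed only that to the tool. The sketch below assumes the code is marked with triple-backtick fences, which may not match your data; the pattern and function name are made up for illustration.

```python
import re

# Opening fence (with optional language tag), lazily captured body, closing fence
FENCE = re.compile(r"```[^\n]*\n(.*?)```", re.DOTALL)


def extract_code_blocks(text):
    """Return the bodies of all fenced code blocks found in the text."""
    return FENCE.findall(text)
```

Each extracted block could then be written to a temporary file and analyzed like any other source file. Note that comment-related metrics would then only reflect comments inside the code, not the surrounding prose.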