datalad / datalad-helpme

Automated issue submission for datalad via helpme

Adding GitHub Workflow Parser #7

Open vsoch opened 4 years ago

vsoch commented 4 years ago

This is a first shot at adding the parser as a GitHub action that runs when an issue is submitted:

This should serve a twofold purpose: to help the user, and to keep a little database of reported issues. I suspect we will want to get a base merged, and then tweak the details once the datalad PR is merged.

yarikoptic commented 4 years ago

An MD5 of the full traceback might be too rigid. I would have made it a triple hierarchy:

Such levels would allow matching similar, even if not identical, issues on the client side (e.g. the full repo could be cloned and updated in the cache). A GitHub action could be used to make records of new issues (which would already have that composite fingerprint in them). Although there might be benefits from collecting additional traceback and wtf details for already existing issues, I am afraid it might be too much chatter if we are to monitor this collection of issues.

vsoch commented 4 years ago

Okay, so just to make sure I have it right, you would do the following (these are just randomly derived values so we can see what it looks like):

So you are proposing it would look like:

RuntimeError-<md5-functions>-<md5-datalad>

and then store detailed info per issue there, with the full traceback etc., which could differ (line numbers shift between changes, paths differ in messages, etc.). Additional matching could be done on that narrowed-down set.
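To make the proposed identifier concrete, here is a minimal sketch of how such a composite fingerprint could be assembled. It assumes that `<md5-functions>` is the MD5 of the function names in the traceback and `<md5-datalad>` is the MD5 of the datalad files involved; neither is spelled out above. The helper names (`md5_of`, `composite_fingerprint`) and the sample values are hypothetical and are not taken from the actual script.

```python
import hashlib


def md5_of(items):
    """Return the hex MD5 digest of a sequence of strings joined by newlines."""
    return hashlib.md5("\n".join(items).encode("utf-8")).hexdigest()


def composite_fingerprint(exc_name, functions, datalad_files):
    """Build an identifier shaped like <exception>-<md5-functions>-<md5-datalad>.

    Assumption: `functions` holds the function names seen in the traceback and
    `datalad_files` holds the datalad source files seen in it. Line numbers are
    left out on purpose so that they can shift without changing the result.
    """
    return "-".join([exc_name, md5_of(functions), md5_of(datalad_files)])


# Hypothetical inputs, only to show the shape of the identifier.
fingerprint = composite_fingerprint(
    "RuntimeError",
    ["main", "run"],
    ["datalad/cmdline/main.py"],
)
print(fingerprint)
```

Because the three parts are joined in order, two reports could be compared on the exception name alone, on the first two parts, or on the whole string, which is one way to read the "levels" idea.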

If the traceback is part of the md5, and it's included in the issue, we would definitely be storing it. For the line numbers, I think that's probably overkill for the points that you mentioned.

My 0.02 for the above - I think the specific dependency lists and functions list might be too detailed for grouping errors. If we have an exception name, and then md5 based on the traceback, I think that could be enough for a human to browse, and to match issues that belong together. On the other hand, you are thinking that you would want to search based on md5 of just a functions list, or just a hash of functions? I have mixed feelings about this, because I don't think I fully understand what a functions list is. My instinct is that we should start with a simpler (less detailed) approach and only dive into more detail if we find it doesn't work well (meaning that two exceptions are labeled as the same but are very different to resolve / address, or we need to search for something and find that we cannot).
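For reference, one possible reading of a "functions list" is the ordered list of frames in the traceback, minus line numbers, split by whether the frame comes from the package of interest. The sketch below uses the standard library `traceback.extract_tb`; the function name `split_frames`, the `package` parameter, and the `filename:function` entry format are assumptions made for illustration, not what the parser does.

```python
import traceback


def split_frames(exc, package="datalad"):
    """Split the traceback frames of `exc` into (package_frames, other_frames).

    Each entry is "filename:function". Line numbers are deliberately dropped,
    since they shift between releases. A frame counts as belonging to the
    package when `package` is one of the components of its file path.
    """
    inside, outside = [], []
    for frame in traceback.extract_tb(exc.__traceback__):
        entry = f"{frame.filename}:{frame.name}"
        path_parts = frame.filename.replace("\\", "/").split("/")
        if package in path_parts:
            inside.append(entry)
        else:
            outside.append(entry)
    return inside, outside


def _fail():
    raise RuntimeError("example failure")


try:
    _fail()
except RuntimeError as err:
    package_frames, other_frames = split_frames(err)
    print(package_frames)
    print(other_frames)
```

Hashing each of the two lists separately would then give the two MD5 components, at the cost of being more detailed than an exception name plus a single traceback hash.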

vsoch commented 4 years ago

okay actually I think I figured it out re: the lists:

In [70]: datald
Out[70]: ['datalad', 'datalad/cmdline/main.py']

In [71]: others
Out[71]: 
['site-packages/IPython/terminal/embed.py',
 'site-packages/IPython/terminal/embed.py',
 'site-packages/IPython/terminal/embed.py',
 'site-packages/IPython/terminal/interactiveshell.py',
 'site-packages/IPython/core/interactiveshell.py',
 'site-packages/IPython/core/interactiveshell.py',
 'site-packages/IPython/core/async_helpers.py',
 'site-packages/IPython/core/interactiveshell.py',
 'site-packages/IPython/core/interactiveshell.py',
 'site-packages/IPython/core/interactiveshell.py',
 'site-packages/IPython/terminal/embed.py',
 'site-packages/IPython/terminal/embed.py',
 'site-packages/IPython/terminal/embed.py',
 'site-packages/IPython/terminal/interactiveshell.py',
 'site-packages/IPython/core/interactiveshell.py',
 'site-packages/IPython/core/interactiveshell.py',
 'site-packages/IPython/core/async_helpers.py',
 'site-packages/IPython/core/interactiveshell.py',
 'site-packages/IPython/core/interactiveshell.py',
 'site-packages/IPython/core/interactiveshell.py']

vsoch commented 4 years ago

okay just updated the script here to use the updated (more specific) hash.