brainhackorg / brainmatch

Event project requirement-contributor request matching toolkit

Add help function, restrict gh call to certain labels and change the way tabs are introduced into tsv #7

Closed: eurunuela closed this 4 years ago

eurunuela commented 4 years ago

This PR adds:

jhlegarreta commented 4 years ago

Thanks for doing this. Looks great.

Tabs are no longer introduced with \t but rather with 4 spaces. The \t was introducing extra tabs, which were probably the source of the issues we had with pandas. Now only issue titles containing quotation marks still cause problems when generating the TSV, because they break into separate rows.

Does using a CSV solve the line-breaking issue? If not, and we stick with the TSV, is adding 4 blank spaces to data/projects.tsv enough for the Python script to work, or will it also be necessary to, e.g., strip the quotation marks? Also, how will gh issue view output the labels: will different labels be separated by commas, etc.? Can you please have a look at that?
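For reference, here is a minimal sketch of how quoted titles behave when a tab-separated file is written and read with Python's standard csv module. It assumes the rows are written from Python, which this thread does not confirm, and the issue numbers, titles, and labels are hypothetical. The writer quotes any field that contains a quotation mark and doubles the embedded quotes, so each record stays on one row with three fields:

```python
import csv
import io

# Hypothetical rows: issue number, title, labels.
# The second title contains double quotes, the case reported as problematic.
rows = [
    ["7", "Add help function", "enhancement"],
    ["9", 'Rename "projects" file', "bug,documentation"],
]

# Write the rows as tab-separated values into an in-memory buffer.
# With the default QUOTE_MINIMAL setting, only fields that contain the
# delimiter, a quote character, or a line break are quoted.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)

# Read the same text back and check that every record has three fields.
buffer.seek(0)
parsed = list(csv.reader(buffer, delimiter="\t"))

for record in parsed:
    print(len(record), record)
```

Whether this helps here depends on how data/projects.tsv is actually produced; if it is assembled by a shell script, the quoting would have to be handled there instead.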

eurunuela commented 4 years ago

Does using a CSV solve the line-breaking issue? If not, and we stick with the TSV, is adding 4 blank spaces to data/projects.tsv enough for the Python script to work, or will it also be necessary to, e.g., strip the quotation marks? Also, how will gh issue view output the labels: will different labels be separated by commas, etc.? Can you please have a look at that?

I just checked again. The issue with the double tabs is solved but something is wrong with the header row. All three values are stored in the first column instead of in three columns. Given that we already know what the structure of the tsv is, we can tell pandas not to read the first row:

import pandas as pd
my_tsv = pd.read_csv('my_tsv.tsv', sep='\t', header=None, skiprows=1)  # skiprows=1 skips the header row

I checked and this line of code works.
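As a rough illustration of skipping a malformed header and supplying the column names by hand, the sketch below reads a small in-memory sample with pandas. The sample contents and the column names (issue, title, labels) are assumptions made for this example; the actual layout of the file is not shown in this thread.

```python
import io

import pandas as pd

# Hypothetical file contents: the header fields are separated by spaces,
# while the data rows are separated by real tab characters.
raw = (
    "issue    title    labels\n"
    "12\tAdd help function\tenhancement\n"
    "15\tFix tsv export\tbug\n"
)

# skiprows=1 drops the malformed header line, header=None tells pandas that
# the remaining lines hold data only, and names supplies the column labels.
projects = pd.read_csv(
    io.StringIO(raw),
    sep="\t",
    header=None,
    skiprows=1,
    names=["issue", "title", "labels"],  # assumed column names
)

print(projects.shape)
print(list(projects.columns))
```

Passing names explicitly keeps the columns addressable by label even though the header line in the file is ignored.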

jhlegarreta commented 4 years ago

Have not tested, but I assume it provides the output in the expected format.