Closed eurunuela closed 4 years ago
Thanks for doing this. Looks great.
Tabs are no longer introduced with \t but rather with 4 spaces. The \t was introducing extra tabs, which were probably the source of the issues we had with pandas. Now, only issue titles containing quotation marks are a problem when generating the TSV, as they break into different rows.
Does using a `CSV` solve the line breaking issue? If not, and if we stick to the `TSV`, is adding 4 blank spaces to `data/projects.tsv` enough for the Python script to work or else will it be necessary to e.g. strip the quotation marks, i.e. how will `gh issue view` output labels: will different labels be separated by commas, etc.? Can you please have a look at that?
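On the quoting question, here is a minimal sketch of one option, not something this PR does: letting Python's standard `csv` module write the tab-separated file, so that titles containing quotation marks are escaped rather than stripped. The column names and rows are made up for illustration and are not taken from `data/projects.tsv`.

```python
import csv
import io

# Hypothetical rows (number, title, labels); one title contains quotation marks.
rows = [
    ("12", 'Add "quick start" guide', "project,status:web_ready"),
    ("15", "Plain title", "project"),
]

# Write tab-separated output. With the default QUOTE_MINIMAL setting the
# writer wraps fields containing quotes in quotes and doubles the inner ones.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(["number", "title", "labels"])  # assumed header names
writer.writerows(rows)

# Read the same text back to check that every row still has three fields.
buffer.seek(0)
parsed = list(csv.reader(buffer, delimiter="\t"))
```

If the file is written this way, a reader that follows the same quoting rules should get the titles back unchanged, so stripping the quotation marks may not be needed. Whether this fits the current shell-based workflow is a separate question.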
I just checked again. The issue with the double tabs is solved, but something is wrong with the header row: all three values are stored in the first column instead of in three columns. Given that we already know the structure of the TSV, we can tell pandas not to treat the first row as a header:
my_tsv = pd.read_csv('my_tsv.tsv', sep='\t', header=None, skiprows=0)
I checked and this line of code works.
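Note that with `header=None` and `skiprows=0` pandas skips nothing, so the header line is read as the first data row. If that row should be dropped entirely, one possible variant is to skip it and supply the column names explicitly. This is only a sketch: the column names and the sample contents below are assumptions, since the real header values are not shown in this thread.

```python
import io

import pandas as pd

# Hypothetical file contents standing in for the real TSV.
raw = (
    "number\ttitle\tlabels\n"
    "12\tFirst issue\tproject\n"
    "15\tSecond issue\tproject\n"
)

# Assumed column names; replace with the real ones.
columns = ["number", "title", "labels"]

# skiprows=1 drops the header line; names= supplies the column labels.
df = pd.read_csv(
    io.StringIO(raw),
    sep="\t",
    header=None,
    names=columns,
    skiprows=1,
)
```

With the sample above, `df` holds the two issue rows and the header text does not appear in the data.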
Have not tested, but I assume it provides the output in the expected format.
This PR adds:
- [...] `-h` and the `--help` flags are passed.
- `${EVENT}` issues are checked first and only those with the `status:web_ready`, `status:published` and `project` labels are saved into the output file.
- Tabs are no longer introduced with `\t` but rather with 4 spaces. The `\t` was introducing extra tabs, which were probably the source of the issues we had with pandas. Now, only issue titles containing quotation marks are a problem when generating the TSV, as they break into different rows. I haven't seen any other incorrect behavior with this approach.