Closed cptanalatriste closed 8 months ago
Carlos, have you verified that issue labels are present in the data? The Events API does not (unfortunately) surface all the issue metadata (see related discussion in https://github.com/igrigorik/gharchive.org/issues/183) so I wouldn't be surprised if this data is missing.
/cc @annafil
@cptanalatriste wanted to follow up here and echo what @igrigorik said in case you are still working on this issue -- I believe issue labels are not included in the /events API feed coming from GitHub (just webhooks), so I would not expect to see them in the GHArchive data set.
Hey!
I believe the information is there (at least some of it), however since it's nested, you must index into the field in order to get the actual values. I ran the following query
SELECT repo.name, JSON_EXTRACT_SCALAR(payload, '$.issue.labels[0].name') label1, JSON_EXTRACT_SCALAR(payload, '$.issue.labels[1].name') label2, JSON_EXTRACT_SCALAR(payload, '$.issue.labels[2].name') labels3 FROM [githubarchive:day.20150101] WHERE type IN ('IssuesEvent') AND JSON_EXTRACT_SCALAR(payload, '$.action') = 'opened' LIMIT 50
and my result is attached.
Please let me know if this is incorrect or if I may have misunderstood the problem!
On Tue, 5/29/18, raghavsehtia notifications@github.com wrote:
Subject: Re: [igrigorik/gharchive.org] Labels per Issue (#184) To: "igrigorik/gharchive.org" gharchive.org@noreply.github.com Cc: "Subscribed" subscribed@noreply.github.com Date: Tuesday, May 29, 2018, 5:56 PM
Hey!
I believe the information is there (at least some of it), however since it's nested, you must index into the field in order to get the actual values. I ran the following query
SELECT repo.name, JSON_EXTRACT_SCALAR(payload, '$.issue.labels[0].name') label1, JSON_EXTRACT_SCALAR(payload, '$.issue.labels[1].name') label2, JSON_EXTRACT_SCALAR(payload, '$.issue.labels[2].name') labels3 FROM [githubarchive:day.20150101] WHERE type IN ('IssuesEvent') AND JSON_EXTRACT_SCALAR(payload, '$.action') = 'opened' LIMIT 50
and my result is attached.
Please let me know if this is incorrect or if I may have misunderstood the problem!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.
I'm trying to run some analysis tasks over the labels an issue gets attached. I started running this query on BigQuery:
And I obtained no results, even if I remove the action filter. Does this information get mined in the dataset?