Data quality of the usage data

github-copilot-resources / copilot-metrics-viewer

Tool to visualize the Copilot metrics provided via the Copilot Business Metrics API (currently in public beta)

https://copilot-metrics-viewer-gthcc5cmd9ebf2ff.westeurope-01.azurewebsites.net/

MIT License


Data quality of the usage data #54

Open DevOps-zhuang opened 5 months ago

DevOps-zhuang commented 5 months ago

As we add more features built on the GitHub Copilot usage API, we have found that the data returned by the API is not always complete. I have summarized the missing-data scenarios below, and we are considering exposing these gaps in the Copilot viewer tool. Is that needed?

Data missing scenarios:

1) No data for some days, even though they are within the past 28 days.
2) total_lines_suggested is zero, even though there is other data for that day.
3) The breakdown is empty.
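
The three scenarios above could be checked with something like the following. This is only a minimal sketch, not the tool's actual implementation. It assumes each record is a dict with an ISO-formatted `day` string and a `total_suggestions_count` field alongside the `total_lines_suggested` and `breakdown` fields mentioned above; the function name `find_quality_issues` and the rule used for scenario 2 (zero lines suggested while the suggestion count is above zero) are hypothetical choices made for illustration.

```python
from datetime import date, timedelta


def find_quality_issues(records, end_day, window_days=28):
    """Return the days affected by each of the three gap scenarios.

    `records` is a list of dicts shaped like the usage API response
    (field names other than total_lines_suggested and breakdown are assumed).
    """
    by_day = {record["day"]: record for record in records}

    # Every day we expect to see, oldest first, ending at end_day inclusive.
    expected_days = [
        (end_day - timedelta(days=offset)).isoformat()
        for offset in range(window_days - 1, -1, -1)
    ]

    issues = {"missing_days": [], "zero_lines_suggested": [], "empty_breakdown": []}
    for day in expected_days:
        record = by_day.get(day)
        if record is None:
            # Scenario 1: no record at all for a day inside the window.
            issues["missing_days"].append(day)
            continue
        # Scenario 2 (assumed rule): zero lines suggested although
        # suggestions were counted for that day.
        if (record.get("total_lines_suggested", 0) == 0
                and record.get("total_suggestions_count", 0) > 0):
            issues["zero_lines_suggested"].append(day)
        # Scenario 3: breakdown is missing or an empty list.
        if not record.get("breakdown"):
            issues["empty_breakdown"].append(day)
    return issues


# Made-up sample data covering a three-day window.
sample = [
    {"day": "2024-06-01", "total_suggestions_count": 10,
     "total_lines_suggested": 25, "breakdown": [{"language": "python"}]},
    {"day": "2024-06-03", "total_suggestions_count": 8,
     "total_lines_suggested": 0, "breakdown": []},
]
report = find_quality_issues(sample, end_day=date(2024, 6, 3), window_days=3)
```

With the sample data, `report` lists 2024-06-02 as a missing day and 2024-06-03 under both the zero-lines and empty-breakdown checks. A viewer could render each list as a warning next to the affected dates.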

martedesco commented 4 months ago

@DevOps-zhuang , thanks for sharing these insights. I think it is indeed important to consider these gaps in the visualization tool. I will keep a note on that.

Regarding the data quality, it is important to highlight that the API is currently in beta, and there may be some gaps as we continue to refine the service towards general availability. If these gaps are substantial and impact the adoption analysis, I encourage you to reach out to your account manager, who can connect with the appropriate teams within GitHub to ensure our team can follow up on these issues. 🙏🏼 cc: @djopatrny

djopatrny commented 4 months ago

Thanks for your feedback @DevOps-zhuang - when was the last time you pulled the data? As @martedesco mentioned, we are making incremental improvements to the data quality during the beta while we prepare for the GA release.

DevOps-zhuang commented 4 months ago

I understand this API is in beta, so we hope the team can take note of such issues and fix them :-) To make the gaps more visible, I am adding a data-quality check to the Copilot viewer tool. It will flag inconsistent data, empty breakdowns, and a total suggestion count of 0.

DevOps-zhuang commented 4 months ago

> Thanks for your feedback @DevOps-zhuang - when was the last time you pulled the data? As @martedesco we are making incremental improvements to the data quality during the beta while we prepare for the GA release.

I check it each day, and I am also adding a feature to persist the fetched data. If a later fetch returns more data, it will be written to the file to fill the gap. I will keep watching it, thanks!
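
The persistence idea described above might look roughly like this. It is a sketch under assumptions rather than the feature's real code: records are assumed to be dicts keyed by an ISO-formatted `day` field, the store is assumed to be a single JSON file, and `completeness`, `merge_usage`, and `update_store` are hypothetical names. The scoring rule (a fetched record replaces a stored one only if it is at least as complete) is one possible way to decide when the latest fetch "has more data".

```python
import json
from pathlib import Path


def completeness(record):
    """Crude score: one point for a non-empty breakdown,
    one point for a non-zero total_lines_suggested."""
    has_breakdown = bool(record.get("breakdown"))
    has_lines = record.get("total_lines_suggested", 0) > 0
    return int(has_breakdown) + int(has_lines)


def merge_usage(stored, fetched):
    """Merge freshly fetched records into stored ones, keyed by day.

    New days are added; an existing day is overwritten only when the
    fetched record is at least as complete as the stored one.
    """
    merged = {record["day"]: record for record in stored}
    for record in fetched:
        current = merged.get(record["day"])
        if current is None or completeness(record) >= completeness(current):
            merged[record["day"]] = record
    # ISO date strings sort chronologically.
    return [merged[day] for day in sorted(merged)]


def update_store(path, fetched):
    """Load the JSON store (if present), merge in `fetched`, and write it back."""
    path = Path(path)
    stored = json.loads(path.read_text()) if path.exists() else []
    merged = merge_usage(stored, fetched)
    path.write_text(json.dumps(merged, indent=2))
    return merged


# Made-up in-memory example (no file is written here).
stored = [
    {"day": "2024-06-01", "total_lines_suggested": 25, "breakdown": [{"language": "python"}]},
    {"day": "2024-06-03", "total_lines_suggested": 0, "breakdown": []},
]
fetched = [
    {"day": "2024-06-01", "total_lines_suggested": 0, "breakdown": []},
    {"day": "2024-06-02", "total_lines_suggested": 12, "breakdown": [{"language": "go"}]},
    {"day": "2024-06-03", "total_lines_suggested": 30, "breakdown": [{"language": "python"}]},
]
merged = merge_usage(stored, fetched)
```

In this example the stored 2024-06-01 record is kept because the fetched copy is less complete, 2024-06-02 is added, and the incomplete 2024-06-03 record is replaced by the fuller one.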