Closed frank-zsy closed 1 year ago
This issue has not been replied to for 24 hours, please pay attention to this issue: @gymgym1212 @xiaoya-yaya @xgdyp
I will implement this recently. The new data will contain a 2021-10-raw field to store the original data, and the new 2021-10 value will be generated from 2021-08 to 2021-12 as above.
I think this will affect Hypercrx too. @tyn1998
The data is ready now. I can upload it after Hypercrx adapts to the format. The data may also support other fields like 2022, 2022-Q2 and all too.
"recently" = 20 mins 🤣
I think this will affect Hypercrx too. @tyn1998
Yes.
The data is ready now, I can upload data after Hypercrx fit the format
The procedure is:
, may also support other fields like 2022, 2022-Q2 and all too.
Will these new data fields occur in the existing metrics? Or will they just be put into the following new metrics? In the second case, I think we can do this when we actually present them.
I think @zhicheng-ning should also be informed, since the data service that supports several DataEase screens is also a consumer of OpenDigger data.
Will these new data fields occur in https://github.com/hypertrons/hypertrons-crx/issues/515#issue-1444862182? Or will they just be put into the following new metrics? In the second case, I think we can do this when we actually present them.
I am not quite sure about this one. For statistical metrics, quarterly and yearly data can be calculated from monthly data, so we only need to add new data fields to the metrics that cannot simply be summed. Actually, there is a metric that fits the rule, which is participants.
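To illustrate why some metrics can be rolled up from monthly values and others cannot, here is a minimal Python sketch. It assumes that participants is a count of distinct accounts, which the thread does not spell out; the function names, the sample numbers, and the account names are all made up for illustration and are not OpenDigger code.

```python
from typing import Dict, Iterable, Set


def quarterly_sum(monthly: Dict[str, float], months: Iterable[str]) -> float:
    """Additive metric: the quarter total is just the sum of its months."""
    return sum(monthly[m] for m in months)


def quarterly_participants(monthly_ids: Dict[str, Set[str]],
                           months: Iterable[str]) -> int:
    """Distinct-count metric (assumed): the same account can be active in
    several months, so the quarter value is the size of the union."""
    seen: Set[str] = set()
    for m in months:
        seen |= monthly_ids[m]
    return len(seen)


q2 = ["2022-04", "2022-05", "2022-06"]

# Hypothetical sample data, not real OpenDigger values.
issue_counts = {"2022-04": 10, "2022-05": 12, "2022-06": 8}
participant_ids = {
    "2022-04": {"alice", "bob"},
    "2022-05": {"bob", "carol"},
    "2022-06": {"alice"},
}

additive_total = quarterly_sum(issue_counts, q2)
distinct_total = quarterly_participants(participant_ids, q2)
# Summing the monthly participant counts would double-count alice and bob.
naive_total = sum(len(participant_ids[m]) for m in q2)
```

In this sample the naive sum of monthly participant counts is larger than the distinct count, which is why such a metric would need its own 2022-Q2 or 2022 field rather than being derived downstream.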
I think @zhicheng-ning may also be informed since data service that supports several DataEase screens is also a consumer of OpenDigger data.
Yes. @zhicheng-ning, do you think this may affect DataEase dashboards?
0.15 * Value_20218 + 0.35 * Value_20219 + 0.35 * Value_202211 + 0.15 * Value_202212
Hi, I want to know why 0.35 * Value_202211 + 0.15 * Value_202212 is here, not 0.35 * Value_202111 + 0.15 * Value_202112.
Hi, I want to know why 0.35 * Value_202211 + 0.15 * Value_202212 is here, not 0.35 * Value_202111 + 0.15 * Value_202112.
I can not find out what is the difference here.
do you think this may affect DataEase dashboards?
Actually I'm working on od-api which is a data transform service. I think the change in the upstream data format has little impact on me, but as there are more and more downstream projects in the future, I suggest that the upstream data format is as stable as possible.
I can not find out what is the difference here.
Value_202211 -> Value_202111
@zhicheng-ning That is my mistake, a typo. I meant 202108 - 202112.
Actually I'm working on od-api which is a data transform service. I think the change in the upstream data format has little impact on me, but as there are more and more downstream projects in the future, I suggest that the upstream data format is as stable as possible.
Agreed. I think we can make the API format contain as much data as we have, so that we will not need to change it later. As this is still the early stage of the OpenDigger data export process, format changes may be inevitable.
Actually, there is a metric that fits the rule, which is participants.
I got it. So the new data fields will occur in some of the existing metrics and changes in Hypercrx are required.
Hi @frank-zsy, when was the data with raw exported and uploaded to OSS?
Hypercrx is not yet ready for the new yyyy-mm-raw field, so charts with an extra yyyy-mm-raw entry are broken now:
I will fix it right now, with https://github.com/hypertrons/hypertrons-crx/issues/577 handled as well.
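One way a downstream consumer could tolerate the extra yyyy-mm-raw keys is to separate them from the ordinary period keys before plotting. The sketch below is only an illustration in Python, not the actual Hypercrx fix; the helper name `split_raw_fields` and the sample values are hypothetical, and it assumes a metric is a flat mapping from period keys to numbers.

```python
from typing import Dict, Tuple

RAW_SUFFIX = "-raw"


def split_raw_fields(
    metric: Dict[str, float],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Split a metric into values to display and the original '-raw' values.

    The raw values are re-keyed by their plain period ('2021-10-raw'
    becomes '2021-10') so they can be looked up next to the estimates.
    """
    display = {k: v for k, v in metric.items() if not k.endswith(RAW_SUFFIX)}
    raw = {
        k[: -len(RAW_SUFFIX)]: v
        for k, v in metric.items()
        if k.endswith(RAW_SUFFIX)
    }
    return display, raw


# Hypothetical sample, not real OpenDigger values.
sample = {"2021-09": 100, "2021-10": 98.5, "2021-10-raw": 50, "2021-11": 96}
display, raw = split_raw_fields(sample)
```

With this split, a chart would iterate over `display` only, while `raw` stays available for a tooltip or a note that the 2021-10 point is an estimate.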
As GHArchive was offline in 2021.10 for about half a month, the statistical data in 2021.10 are all about 50% of 2021.9 and 2021.11.
So when we upload data to OSS, we can do a little trick to fix the problem. I think we can use something like
0.15 * Value_20218 + 0.35 * Value_20219 + 0.35 * Value_202211 + 0.15 * Value_202212
to estimate the actual value of 2021.10, and we should still keep a field 202110_original to store the real value.
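The estimate can be sketched in Python as follows. This is an illustration under stated assumptions, not the OpenDigger export code: it uses the months 2021-08, 2021-09, 2021-11 and 2021-12 (the author clarified in the thread that the 2022 months in the formula were a typo), it stores the original value under 2021-10-raw (the field name mentioned in the thread, rather than 202110_original), and the function name `patch_october` and the sample numbers are hypothetical.

```python
from typing import Dict

# Weights from the proposal; they sum to 1.0, so the estimate is a
# weighted average of the four neighbouring months.
WEIGHTS = {"2021-08": 0.15, "2021-09": 0.35, "2021-11": 0.35, "2021-12": 0.15}
TARGET = "2021-10"


def patch_october(metric: Dict[str, float]) -> Dict[str, float]:
    """Return a copy with 2021-10 replaced by the weighted estimate and the
    original value preserved under '2021-10-raw'.

    If the target month or any neighbouring month is missing, the copy is
    returned unchanged rather than guessing.
    """
    patched = dict(metric)
    if TARGET not in metric or any(m not in metric for m in WEIGHTS):
        return patched
    estimate = sum(w * metric[m] for m, w in WEIGHTS.items())
    patched[TARGET + "-raw"] = metric[TARGET]
    patched[TARGET] = estimate
    return patched


# Hypothetical sample: October shows roughly half of its neighbours.
sample = {"2021-08": 100, "2021-09": 100, "2021-10": 50,
          "2021-11": 100, "2021-12": 100}
patched = patch_october(sample)
```

For this flat sample the estimate comes out at about 100, in line with the neighbouring months, while the recorded 50 stays available under 2021-10-raw. The input mapping is left untouched.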