opensource-observer / oso

Measuring the impact of open source software
https://opensource.observer
Apache License 2.0

Timeseries metrics seem to have conflicting data #2377

Closed ryscheng closed 1 month ago

ryscheng commented 1 month ago

What is it?

From Elliot

When running a query, I get different amounts returned for the same metricId, projectId and sampleDate. I'm not sure whether my expectations are wrong or whether the data just isn't quite ready yet.

1: {
    amount: 1,
    metricId: 'eziTctwFeZugH0ocp4pjj9cXQFTRVCm6KgJ+ozZCmmQ=',
    projectId: 'cLZx_e_EZ-UuWqfOXvqTyUZfecyfWGmbQnA2lRMKrR8=',
    sampleDate: '2024-04-23',
    unit: null
},
2: {
    amount: 115,
    metricId: 'eziTctwFeZugH0ocp4pjj9cXQFTRVCm6KgJ+ozZCmmQ=',
    projectId: 'cLZx_e_EZ-UuWqfOXvqTyUZfecyfWGmbQnA2lRMKrR8=',
    sampleDate: '2024-04-23',
    unit: null
}

Found with this query:

oso_timeseriesMetricsByProjectV0(where: $where) {
  amount
  metricId
  projectId
  sampleDate
  unit
}

with a where clause that looks like this:

where: {
  projectId: {
    _eq: projectId,
  },
  metricId: {
    _in: metricIds,
  },
  sampleDate: {
    _gt: startDate,
    _lte: endDate,
  },
},
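
For readers reproducing this, here is a minimal sketch of how the `where` variable above might be assembled. `buildWhere` is a hypothetical helper name and the IDs and dates are placeholders, none of which come from the OSO API. It also assumes the usual reading of these comparison operators, where `_gt` excludes `startDate` and `_lte` includes `endDate`.

```javascript
// Hypothetical helper (not part of the OSO API): builds the `where`
// variable in the same shape as the clause shown above.
function buildWhere(projectId, metricIds, startDate, endDate) {
  return {
    projectId: { _eq: projectId },
    metricId: { _in: metricIds },
    // Assumed semantics: _gt excludes startDate, _lte includes endDate,
    // so the window is (startDate, endDate].
    sampleDate: { _gt: startDate, _lte: endDate },
  };
}

// Placeholder values for illustration only.
const where = buildWhere(
  "example-project-id",
  ["example-metric-id"],
  "2024-04-22",
  "2024-04-23"
);

// This object would be passed as the GraphQL `variables` payload.
console.log(JSON.stringify({ where }, null, 2));
```

Under that assumption, a row sampled exactly on `startDate` would not be returned, while one sampled on `endDate` would.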

I would expect only one entry per unique combination of metricId, projectId and sampleDate.
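
That expectation can be checked client-side. The following is a minimal sketch, assuming the rows have the shape shown in the output above; `findConflicts` is a hypothetical helper name, not something provided by the OSO API. It groups rows by the (metricId, projectId, sampleDate) key and reports every key that carries more than one distinct amount.

```javascript
// Hypothetical helper (not part of the OSO API): returns one entry per
// (metricId, projectId, sampleDate) key that has conflicting amounts.
function findConflicts(rows) {
  const amountsByKey = new Map();
  for (const row of rows) {
    // Serializing the three fields as a JSON array gives an unambiguous key.
    const key = JSON.stringify([row.metricId, row.projectId, row.sampleDate]);
    if (!amountsByKey.has(key)) amountsByKey.set(key, new Set());
    amountsByKey.get(key).add(row.amount);
  }

  const conflicts = [];
  for (const [key, amounts] of amountsByKey) {
    if (amounts.size > 1) {
      const [metricId, projectId, sampleDate] = JSON.parse(key);
      conflicts.push({ metricId, projectId, sampleDate, amounts: [...amounts] });
    }
  }
  return conflicts;
}

// The two rows reported above.
const rows = [
  {
    amount: 1,
    metricId: "eziTctwFeZugH0ocp4pjj9cXQFTRVCm6KgJ+ozZCmmQ=",
    projectId: "cLZx_e_EZ-UuWqfOXvqTyUZfecyfWGmbQnA2lRMKrR8=",
    sampleDate: "2024-04-23",
    unit: null,
  },
  {
    amount: 115,
    metricId: "eziTctwFeZugH0ocp4pjj9cXQFTRVCm6KgJ+ozZCmmQ=",
    projectId: "cLZx_e_EZ-UuWqfOXvqTyUZfecyfWGmbQnA2lRMKrR8=",
    sampleDate: "2024-04-23",
    unit: null,
  },
];

console.log(findConflicts(rows));
```

For the two rows above, this reports a single conflicting key with amounts 1 and 115. Rows that repeat the same amount for a key are treated as plain duplicates rather than conflicts.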

Thanks for helping me understand why this is the case.

ravenac95 commented 1 month ago

Technically, we fixed the issue in #2400, but it will take a little while for the fix to be reflected in the API because the deployment of our metrics is not yet automatic (we are actively working on this).

ravenac95 commented 1 month ago

Thanks @escottalexander for this issue!