apache / incubator-devlake

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
https://devlake.apache.org/
Apache License 2.0

[Bug][Config-UI] subtask calculateChangeLeadTime unable to handle NULL values #8217

Open mdapke-insight opened 1 week ago

mdapke-insight commented 1 week ago

What happened

The calculateChangeLeadTime subtask in my DevLake Config-UI pipeline fails with the following error:

level=error msg=" [pipeline service] [pipeline #5907] [task #47939] subtask calculateChangeLeadTime ended unexpectedly\n\tWraps: (2) error adding result to batch (500)\n\tWraps: (3) Error 1292 (22007): Incorrect datetime value: '0000-00-00' for column 'first_commit_authored_date' at row 43 (500)\n\tWraps: (4) Error 1292 (22007): Incorrect datetime value: '0000-00-00' for column 'first_commit_authored_date' at row 43\n\tError types: (1) hintdetail.withDetail (2) hintdetail.withDetail (3) hintdetail.withDetail (4) mysql.MySQLError"

What do you expect to happen

The pipeline run completes successfully.

How to reproduce

A new column named first_commit_authored_date was introduced in recent DevLake versions (I am using v1.0.2-beta5). This column did not exist in v0.21, and since upgrading I get the error shown above. The table containing this column is project_pr_metrics.

DevLake does not handle NULL values in this column correctly.

Anything else

No response

Version

v1.0.2-beta5

dosubot[bot] commented 1 week ago

Hey @mdapke-insight! I'm here to help you with any bugs, questions, or contributions you have for the project. Let's tackle this issue together!

The error from the calculateChangeLeadTime subtask points to an invalid datetime value, '0000-00-00', being written to the first_commit_authored_date column. The subtask does not appear to handle NULL or zero datetime values, so the batch insert is rejected by MySQL with Error 1292.

To address this, consider adding a check that handles zero or NULL datetime values before they are processed. This could involve:

  1. Modifying the logic in the calculateChangeLeadTime subtask to check for and handle '0000-00-00' or NULL values in the first_commit_authored_date column.
  2. Ensuring that any datetime processing functions gracefully handle these cases, possibly by skipping the calculation or setting a default value.

Unfortunately, I couldn't find specific details on how NULL values are currently handled in the calculateChangeLeadTime subtask or any existing error handling mechanisms for incorrect datetime values in the repository. You might need to review the code and add the necessary checks to prevent this error from occurring.

mdapke-insight commented 1 week ago

@klesh Could you help with this?

klesh commented 6 days ago

Will take a look.