apache / incubator-devlake

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
https://devlake.apache.org/
Apache License 2.0

[Bug][Refresh] Unable to delete data and collect again #8113

Open pamflores-clip opened 4 days ago

pamflores-clip commented 4 days ago

What happened

I have a record in the DB that is not updated correctly. In GitHub, this ticket has an updated date in September:

Screenshot 2024-09-27 at 11 34 26 a m

and in the DB the updated date is in March:

Screenshot 2024-09-27 at 10 28 40 a m

What do you expect to happen

I am looking for a way to delete all of the data in a project without deleting the project itself.

How to reproduce

I have already tried Collect Data in Full Refresh Mode, but the record is still not updated in the DB.

Anything else

We updated our DevLake version in August.

Version

v1.0.1-beta7@3f84bae

dosubot[bot] commented 4 days ago

Hello @pamflores-clip! I'm here to assist you with any bugs, questions, or contributions you may have. Let's work together to resolve your issue.

To delete all data in a project without deleting the actual project, you can use the "Collect Data in Full Refresh Mode" option. This mode deletes all existing data within the designated time range and re-collects it. Here are the steps:

  1. Go to the Config UI at http://localhost:4000.
  2. Navigate to the project you want to refresh.
  3. Set the synchronization policy to "Collect Data in Full Refresh Mode."
  4. Start the data collection process.

This should remove the outdated or incorrect data and re-collect it from the original tools [1].
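If you prefer to script the steps above rather than click through the Config UI, the same run can probably be started over the REST API. The sketch below only builds the request and does not send it. The endpoint path (`/blueprints/{id}/trigger`), the JSON field names (`skipCollectors`, `fullSync`), the `/api` prefix, and blueprint ID 1 are all assumptions that should be checked against the API documentation for your DevLake version.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// triggerOptions mirrors the JSON body assumed for the blueprint trigger
// endpoint. The field names are an assumption, not confirmed by this thread.
type triggerOptions struct {
	SkipCollectors bool `json:"skipCollectors"`
	FullSync       bool `json:"fullSync"`
}

// newFullRefreshRequest builds (but does not send) a POST request that asks
// DevLake to re-run a blueprint in full refresh mode.
func newFullRefreshRequest(baseURL string, blueprintID int) (*http.Request, error) {
	body, err := json.Marshal(triggerOptions{SkipCollectors: false, FullSync: true})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("%s/blueprints/%d/trigger", baseURL, blueprintID)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Hypothetical values: Config UI proxy on localhost:4000, blueprint ID 1.
	req, err := newFullRefreshRequest("http://localhost:4000/api", 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually start the run: resp, err := http.DefaultClient.Do(req)
}
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before anything touches a live instance.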

If you have already tried this and the data is still not updated, there might be an issue with the specific version you are using (v1.0.1-beta7@3f84bae). Unfortunately, I couldn't find any known issues or bugs related to the 'Collect Data in Full Refresh Mode' for this version [2].

As a workaround, you can manually delete specific records from the database using the Delete method provided by the Dal interface. Here are the steps:

  1. Identify the Record: Determine the specific record you want to delete. This usually involves knowing the primary key or a unique identifier for the record.
  2. Create an Entity Instance: Create an instance of the entity that represents the record you want to delete. Populate it with the necessary identifying information.
  3. Call the Delete Method: Use the Delete method from the Dal interface to delete the record.

Example Code

// Assuming you have a User model and you want to delete a user with a specific email
user := &models.User{
  Email: "hello@example.com",  // assuming this is the Primary key
}

err := db.Delete(user)
if err != nil {
  // handle error
}

Precautions to Take

  1. Backup Data: Always ensure you have a backup of your data before performing delete operations.
  2. Verify Identifiers: Double-check the identifiers to ensure you are deleting the correct record.
  3. Test in a Safe Environment: If possible, test the delete operation in a staging or development environment before executing it in production.
  4. Error Handling: Implement proper error handling to catch and log any issues that occur during the delete operation.
  5. Transaction Management: If the delete operation is part of a larger set of database operations, consider using transactions to ensure atomicity and consistency [3].

By following these steps and precautions, you can safely and effectively delete specific records from the database without using the UI.
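The identifier, error-handling, and transaction precautions can be combined in one small helper. This is a minimal sketch using Go's standard `database/sql` types rather than DevLake's own `Dal` interface. The table name `issues`, the column `id`, and the `?` placeholder style are assumptions, and `fakeExec` exists only so that the example runs without a database.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"fmt"
)

// execer is the subset of *sql.Tx (and *sql.DB) that the helper needs.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// deleteExactlyOne deletes the row with the given id from a hypothetical
// `issues` table. It returns an error unless exactly one row was removed,
// so the caller knows to roll the surrounding transaction back.
func deleteExactlyOne(ctx context.Context, ex execer, id string) error {
	// Assumed table, column, and placeholder style; adjust to your schema.
	res, err := ex.ExecContext(ctx, "DELETE FROM issues WHERE id = ?", id)
	if err != nil {
		return fmt.Errorf("delete issue %q: %w", id, err)
	}
	n, err := res.RowsAffected()
	if err != nil {
		return fmt.Errorf("rows affected for issue %q: %w", id, err)
	}
	if n != 1 {
		return fmt.Errorf("expected to delete 1 row for issue %q, deleted %d", id, n)
	}
	return nil
}

// fakeExec stands in for a real transaction and reports a fixed row count.
type fakeExec struct{ rows int64 }

func (f fakeExec) ExecContext(context.Context, string, ...any) (sql.Result, error) {
	return driver.RowsAffected(f.rows), nil
}

func main() {
	ctx := context.Background()
	// One matching row: the helper returns no error.
	fmt.Println(deleteExactlyOne(ctx, fakeExec{rows: 1}, "example-id"))
	// No matching row: the helper returns an error instead of silently succeeding.
	fmt.Println(deleteExactlyOne(ctx, fakeExec{rows: 0}, "example-id"))
}
```

Against a real database, you would pass a `*sql.Tx` obtained from `db.BeginTx`, call `tx.Rollback()` when the helper returns an error, and call `tx.Commit()` otherwise.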


Startrekzky commented 4 days ago

@pamflores-clip, thanks for reporting. We'll look at it.

xiaomingzhang12 commented 3 days ago

Hi @pamflores-clip, in the database, the updated_date field represents when the data itself was last updated in the source tool, while the updated_at field indicates when the row was last updated in the database. Could you check whether the updated_at field shows the time you were expecting? Thank you.
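To make the distinction concrete, here is a small sketch of how the two timestamps could be read together when checking a record after a refresh. The column names `updated_date` and `updated_at`, the `issueTimestamps` type, and all dates are illustrative assumptions; only the months (March and September) come from this report.

```go
package main

import (
	"fmt"
	"time"
)

// issueTimestamps holds the two timestamps discussed above.
type issueTimestamps struct {
	UpdatedDate time.Time // when the item last changed in the source tool (assumed column updated_date)
	UpdatedAt   time.Time // when the database row was last written (assumed column updated_at)
}

// day is a small helper that builds a UTC date.
func day(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

// diagnose compares a row against the update time shown in the source tool
// and the time the refresh run started.
func diagnose(row issueTimestamps, sourceUpdated, refreshStarted time.Time) string {
	switch {
	case row.UpdatedAt.Before(refreshStarted):
		return "row was not rewritten by the refresh"
	case row.UpdatedDate.Before(sourceUpdated):
		return "row was rewritten, but the source timestamp is still stale"
	default:
		return "row is up to date"
	}
}

func main() {
	// Hypothetical days within the months mentioned in the report.
	march := day(2024, time.March, 1)
	september := day(2024, time.September, 20)
	refresh := day(2024, time.September, 27)

	row := issueTimestamps{UpdatedDate: march, UpdatedAt: march}
	fmt.Println(diagnose(row, september, refresh))
}
```

If the first case applies, the refresh never touched the row. If the second applies, the row was rewritten but still carries the old source date.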