databendlabs / databend

𝗗𝗮𝘁𝗮, 𝗔𝗻𝗮𝗹𝘆𝘁𝗶𝗰𝘀 & 𝗔𝗜. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com
Other
7.85k stars 750 forks source link

Potential Bug: Tables Undergoing Vacuum Process Can Be Improperly Undropped #16433

Open drmingdrmer opened 2 months ago

drmingdrmer commented 2 months ago

Current Behavior

When vacuuming a table, the following sequence occurs:

  1. Data is removed from S3.
  2. Table metadata is removed from the meta-service.

However, the current API allows undropping of a dropped table even after its S3 data has been purged, leading to potential data inconsistency.

Impact

This bug can result in: Data loss: Undropping a table with purged S3 data creates an empty table.

Proposed Solution

Introduce a vacuum_status flag in the TableMeta structure to prevent undropping of tables undergoing or having completed the vacuum process.

Revised State Transition Flow:

normal <----------.
  |               |
  | drop_table()  | undrop_table()
  v               |
dropped ----------'
  |
  | vacuum_table()
  v
vacuum-in-progress
  |
  | purge table metadata
  v
completely-removed
drmingdrmer commented 2 months ago

Thanks to @SkyFan2002 for identifying this issue.

dosubot[bot] commented 2 months ago

Hey @drmingdrmer, I'm here to help you with any bugs, questions, or contributions you have. Let's get this sorted together! Dosu

To address the issue of undropping tables that have undergone or are undergoing the vacuum process, you can introduce a vacuum_status flag in the TableMeta structure. This flag will help prevent undropping operations on such tables, ensuring data consistency.

Proposed Changes

  1. Modify the TableMeta Structure: Add a vacuum_status field to the TableMeta structure to track the vacuum status of a table.

    #[derive(Clone, Debug, PartialEq, Eq, Serialize, Deserialize)]
    pub struct TableMeta {
       pub schema: Arc<TableSchema>,
       pub engine: String,
       pub options: BTreeMap<String, String>,
       pub created_on: DateTime<Utc>,
       pub updated_on: DateTime<Utc>,
       pub drop_on: Option<DateTime<Utc>>,
       pub vacuum_status: Option<VacuumStatus>, // Add this line
    }
    
    #[derive(Clone, Debug, PartialEq, Eq, Serialize, Deserialize)]
    pub enum VacuumStatus {
       InProgress,
       Completed,
    }
  2. Update Table Creation and Vacuum Processes: Ensure the vacuum_status flag is appropriately set during table creation and vacuum operations.

    async fn create_table(mt: &impl SchemaApi, req: CreateTableReq) -> Result<CreateTableResp, MetaError> {
       let table_meta = TableMeta {
           schema: req.table_meta.schema,
           engine: req.table_meta.engine,
           options: req.table_meta.options,
           created_on: req.table_meta.created_on,
           updated_on: req.table_meta.updated_on,
           drop_on: None,
           vacuum_status: None, // Initialize as None
       };
    
       // Proceed with table creation logic
    }
    
    async fn vacuum_table(mt: &impl SchemaApi, table_id: u64) -> Result<(), MetaError> {
       // Set vacuum_status to InProgress
       let mut table_meta = get_table_meta(mt, table_id).await?;
       table_meta.vacuum_status = Some(VacuumStatus::InProgress);
       update_table_meta(mt, table_id, table_meta.clone()).await?;
    
       // Perform vacuum operation
    
       // Set vacuum_status to Completed
       table_meta.vacuum_status = Some(VacuumStatus::Completed);
       update_table_meta(mt, table_id, table_meta).await?;
       Ok(())
    }
  3. Prevent Undropping of Tables Based on vacuum_status: Modify the undrop_table function to check the vacuum_status before allowing the undrop operation.

    async fn undrop_table(mt: &impl SchemaApi, table_id: u64) -> Result<(), MetaError> {
       let table_meta = get_table_meta(mt, table_id).await?;
       if let Some(vacuum_status) = table_meta.vacuum_status {
           match vacuum_status {
               VacuumStatus::InProgress | VacuumStatus::Completed => {
                   return Err(MetaError::new("Cannot undrop a table that is being or has been vacuumed"));
               }
           }
       }
    
       // Proceed with undrop logic
    }

By implementing these changes, you can effectively prevent the undropping of tables that are undergoing or have completed the vacuum process, thus maintaining data consistency and preventing potential data loss [1][2].

To continue talking to Dosu, mention @dosu.