aptos-labs / aptos-core

Aptos is a layer 1 blockchain built to support the widespread use of blockchain through better technology and user experience.
https://aptosfoundation.org
Other
6.18k stars 3.65k forks source link

[Bug] Query on Re-entrancy vs. Cross-Module Calls in Move VM's set_new_call_frame #13746

Open arthur19q3 opened 4 months ago

arthur19q3 commented 4 months ago

🐛 Bug?

I'm currently delving into the set_new_call_frame function within the MoveVM codebase and have identified a section that raises some questions for me regarding the handling of module re-entrancy. The code snippet below is designed to prevent re-entrancy by checking if a module ID is already present in the active_modules collection:

fn set_new_call_frame(
    &mut self,
    current_frame: &mut Frame,
    gas_meter: &mut impl GasMeter,
    loader: &Loader,
    func: Arc<Function>,
    ty_args: Vec<Type>,
) -> VMResult<()> {
    // Check if the module ID of the function to be called is already active
    match (func.module_id(), current_frame.function.module_id()) {
        // If the current function and the function to be called belong to different modules,
        // but the target module has already been activated
        (Some(module_id), Some(current_module_id)) if module_id != current_module_id => {
            // The logic below may be problematic because it does not distinguish between
            // cross-module calls and recursive calls.
            if self.active_modules.contains(module_id) {
                // If the module ID already exists, the code returns an error indicating that
                // re-entrancy has been detected. However, this might incorrectly identify a legitimate
                // cross-module call as a recursive call.
                return Err(self.set_location(
                    PartialVMError::new(StatusCode::RUNTIME_DISPATCH_ERROR).with_message(
                        format!(
                            "Re-entrancy detected: {} already exists on top of the stack",
                            module_id
                        ),
                    ),
                ));
            }
            // Add the new module's ID to the set of active modules
            self.active_modules.insert(module_id.clone());
        },
        // Other matching logic...
        _ => (),
    }

    // The rest of the function implementation...
}

My inquiry pertains to distinguishing between actual re-entrancy and legitimate cross-module function invocations. Could the current mechanism be considered overly restrictive, and if so, how might we refine it to better accommodate legitimate cross-module interactions without compromising security?

I'm seeking industry insights or recommendations on how to achieve a balanced approach in our VM's access control logic.

Thank you for your expertise and suggestions.

runtian-zhou commented 3 months ago

Thank you for your questions here. This logic is actually closely related to our new dispatchable token standard. You may see the justification logic in the doc if you need complete context.

To answer your question shortly, if your code don't use dynamic dispatch as introduced in the dispatchable token standard, you shouldn't be affected by those checks, as we statically enforce move modules dependencies to be acyclic on chain.

This re-entrant rule is indeed a bit overly restricted at the moment but we believe it's crucial to maintain the core safety properties of Move on chain, i.e: reference safety and re-entrancy safety. You can look into the AIP and there should be very detailed discussion here. To me, I'm fine with the constraint as is because solana also have similar constraints on module level re-entrancy. It could bring some challenges to module developers tho because now the module developer will need to think about what are the modules that would locate in the lower level of stack that is not re-enterable.

Feel free to send me messages if you have more question here! Really appreciate raising this issue here and looking forward to your feedbacks.

github-actions[bot] commented 2 months ago

This issue is stale because it has been open 45 days with no activity. Remove the stale label or comment - otherwise this will be closed in 15 days.