anza-xyz / move

Move compiler targeting LLVM-supported backends
https://discord.gg/wFgfjG9J
Apache License 2.0

Future: Investigate using analysis/optimization framework from move-prover. #204

Open nvjle opened 1 year ago

nvjle commented 1 year ago

Currently we use only the baseline move-prover generated stackless bytecode as input to our translation. However, the prover also provides a number of potentially useful analyses and optimizations that could improve our input code. Moreover, some of these are fairly nice generic frameworks (e.g., a generic dataflow analysis framework and a generic interprocedural compositional analysis framework) that we could build our own stackless bytecode analyses and optimizations on.

Some of the analyses/optimizations are targeted only to the "specification" language, but others are generally applicable. Below is their default pipeline:

pub fn default_pipeline_with_options(options: &ProverOptions) -> FunctionTargetPipeline {
    // NOTE: the order of these processors is important!
    let mut processors: Vec<Box<dyn FunctionTargetProcessor>> = vec![
        DebugInstrumenter::new(),
        // transformation and analysis
        EliminateImmRefsProcessor::new(),
        MutRefInstrumenter::new(),
        ReachingDefProcessor::new(),
        LiveVarAnalysisProcessor::new(),
        BorrowAnalysisProcessor::new_borrow_natives(options.borrow_natives.clone()),
        MemoryInstrumentationProcessor::new(),
        CleanAndOptimizeProcessor::new(),
        UsageProcessor::new(),
        VerificationAnalysisProcessor::new(),
    ];

    if !options.skip_loop_analysis {
        processors.push(LoopAnalysisProcessor::new());
    }

    processors.append(&mut vec![
        // spec instrumentation
        SpecInstrumentationProcessor::new(),
        GlobalInvariantAnalysisProcessor::new(),
        GlobalInvariantInstrumentationProcessor::new(),
        WellFormedInstrumentationProcessor::new(),
        DataInvariantInstrumentationProcessor::new(),
        // monomorphization
        MonoAnalysisProcessor::new(),
    ]);

    if options.mutation {
        // pass which may do nothing
        processors.push(MutationTester::new());
    }

    // inconsistency check instrumentation should be the last one in the pipeline
    if options.check_inconsistency {
        processors.push(InconsistencyCheckInstrumenter::new());
    }

    if !options.for_interpretation {
        processors.push(NumberOperationProcessor::new());
    }
    ...

Of course we're generating LLVM IR and we let LLVM do much of the optimization heavy lifting. But in addition to the fact that running the prover pipeline may give us better input SBC to begin with, there may be other domain/Move-specific analyses/optimizations that are easier to apply on SBC as opposed to later in LLVM.
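As a rough illustration of the worklist-fixpoint style such a generic dataflow framework embodies, here is a toy reaching-definitions pass over a simplified CFG. The types and names (`Cfg`, `reaching_defs`, one definition per block) are invented for this sketch and are not the prover's actual API:

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

type BlockId = usize;

// A toy control-flow graph: block id -> successor block ids, plus,
// for simplicity, one variable definition (by index) per block.
struct Cfg {
    succs: BTreeMap<BlockId, Vec<BlockId>>,
    defs: BTreeMap<BlockId, usize>,
}

// Generic forward dataflow: a transfer function plus a join (set union here),
// iterated with a worklist until a fixpoint is reached.
fn reaching_defs(cfg: &Cfg, entry: BlockId) -> BTreeMap<BlockId, BTreeSet<usize>> {
    let mut out: BTreeMap<BlockId, BTreeSet<usize>> = BTreeMap::new();
    let mut work: VecDeque<BlockId> = VecDeque::new();
    work.push_back(entry);
    while let Some(b) = work.pop_front() {
        // in[b] = union of out[p] over all predecessors p of b.
        let mut inset: BTreeSet<usize> = cfg
            .succs
            .iter()
            .filter(|(_, ss)| ss.contains(&b))
            .filter_map(|(p, _)| out.get(p).cloned())
            .flatten()
            .collect();
        // Transfer: this block's own definition becomes reachable.
        if let Some(d) = cfg.defs.get(&b) {
            inset.insert(*d);
        }
        // If out[b] changed, re-enqueue the successors.
        if out.get(&b) != Some(&inset) {
            out.insert(b, inset);
            for s in cfg.succs.get(&b).into_iter().flatten() {
                work.push_back(*s);
            }
        }
    }
    out
}

fn main() {
    // entry(0) -> 1 -> 2, with a back edge 2 -> 1 (a loop).
    let cfg = Cfg {
        succs: BTreeMap::from([(0, vec![1]), (1, vec![2]), (2, vec![1])]),
        defs: BTreeMap::from([(0, 0), (1, 1), (2, 2)]),
    };
    let out = reaching_defs(&cfg, 0);
    println!("{:?}", out.get(&2).unwrap());
}
```

The prover's frameworks generalize exactly this shape: a processor supplies the lattice join and transfer function, and the framework drives the worklist iteration over the stackless bytecode CFG.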

ksolana commented 1 year ago

These are good to have but we can postpone this work until stage 3.

jcivlin commented 10 months ago

These are good to have but we can postpone this work until stage 3.

I found that compiling stdlib (ascii.move in particular, but this will chain in vector.move and more) generates the bytecode `Prop`, which we do not support. I think it comes from move-prover and is due to the usage of Move specs in the stdlib code.

So the optimizations may wait, but we need to take a deeper look at move-prover sooner.

jcivlin commented 10 months ago

see more about Prop here: https://github.com/solana-labs/move/issues/250
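Since `Prop` instructions carry verification conditions rather than runtime semantics, one pragmatic option is to strip spec-only instructions before lowering. Sketched here with a hypothetical, heavily simplified instruction enum (the real one lives in the stackless-bytecode crate and is much larger):

```rust
// Hypothetical, simplified stackless-bytecode instruction set for illustration.
#[derive(Debug, Clone)]
enum Bytecode {
    Load(usize, u64),
    Prop(String), // spec-only: a proposition injected by the prover pipeline
    Ret(usize),
}

// Drop instructions that exist only for verification; they have no runtime
// semantics, so a translator emitting LLVM IR can safely skip them.
fn strip_spec_only(code: &[Bytecode]) -> Vec<Bytecode> {
    code.iter()
        .filter(|bc| !matches!(bc, Bytecode::Prop(_)))
        .cloned()
        .collect()
}

fn main() {
    let code = vec![
        Bytecode::Load(0, 1),
        Bytecode::Prop("aborts_if false".to_string()),
        Bytecode::Ret(0),
    ];
    let stripped = strip_spec_only(&code);
    assert_eq!(stripped.len(), 2);
}
```

Whether stripping or simply not generating the spec-instrumentation passes is the right fix depends on how the prover pipeline is invoked, which is the deeper look suggested above.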

dmakarov commented 10 months ago

When I compile unit tests using move-cli with the Solana backend, everything compiles correctly, including test_ascii_chars. Here's the generated LLVM code for that test.

define private void @ascii_tests__test_ascii_chars() {
entry:
  %local_0 = alloca { ptr, i64, i64 }, align 8
  %local_1 = alloca i8, align 1
  %local_2 = alloca %struct.ascii__String, align 8
  %local_3 = alloca { ptr, i64, i64 }, align 8
  %local_4 = alloca i8, align 1
  %local_5 = alloca { ptr, i64, i64 }, align 8
  %local_6 = alloca i8, align 1
  %local_7 = alloca i8, align 1
  %local_8 = alloca i1, align 1
  %local_9 = alloca i8, align 1
  %local_10 = alloca i1, align 1
  %local_11 = alloca i64, align 8
  %local_12 = alloca ptr, align 8
  %local_13 = alloca i8, align 1
  %local_14 = alloca i8, align 1
  %local_15 = alloca i8, align 1
  %local_16 = alloca i8, align 1
  %local_17 = alloca { ptr, i64, i64 }, align 8
  %local_18 = alloca %struct.ascii__String, align 8
  %local_19 = alloca ptr, align 8
  %local_20 = alloca ptr, align 8
  %local_21 = alloca i64, align 8
  %local_22 = alloca i64, align 8
  %local_23 = alloca i1, align 1
  %local_24 = alloca i64, align 8
  %local_25 = alloca ptr, align 8
  %local_26 = alloca i1, align 1
  %local_27 = alloca i1, align 1
  %local_28 = alloca i64, align 8
  %local_29 = alloca %struct.ascii__String, align 8
  %local_30 = alloca { ptr, i64, i64 }, align 8
  %local_31 = alloca ptr, align 8
  %local_32 = alloca i64, align 8
  %local_33 = alloca i64, align 8
  %local_34 = alloca i1, align 1
  %local_35 = alloca i64, align 8
  store i8 0, ptr %local_4, align 1
  %load_store_tmp = load i8, ptr %local_4, align 1
  store i8 %load_store_tmp, ptr %local_1, align 1
  %retval = call { ptr, i64, i64 } @move_native_vector_empty(ptr @__move_rttydesc_u8)
  store { ptr, i64, i64 } %retval, ptr %local_5, align 8
  %load_store_tmp1 = load { ptr, i64, i64 }, ptr %local_5, align 8
  store { ptr, i64, i64 } %load_store_tmp1, ptr %local_3, align 8
  br label %bb_6

bb_6:                                             ; preds = %join_bb, %entry
  %load_store_tmp2 = load i8, ptr %local_1, align 1
  store i8 %load_store_tmp2, ptr %local_6, align 1
  store i8 -128, ptr %local_7, align 1
  %lt_src_0 = load i8, ptr %local_6, align 1
  %lt_src_1 = load i8, ptr %local_7, align 1
  %lt_dst = icmp ult i8 %lt_src_0, %lt_src_1
  store i1 %lt_dst, ptr %local_8, align 1
  %cnd = load i1, ptr %local_8, align 1
  br i1 %cnd, label %bb_1, label %bb_0

bb_1:                                             ; preds = %bb_6
  br label %bb_2

bb_2:                                             ; preds = %bb_1
  %load_store_tmp3 = load i8, ptr %local_1, align 1
  store i8 %load_store_tmp3, ptr %local_9, align 1
  %call_arg_0 = load i8, ptr %local_9, align 1
  %retval4 = call i1 @ascii__is_valid_char(i8 %call_arg_0)
  store i1 %retval4, ptr %local_10, align 1
  %cnd5 = load i1, ptr %local_10, align 1
  br i1 %cnd5, label %bb_4, label %bb_3

bb_4:                                             ; preds = %bb_2
  br label %bb_5

bb_3:                                             ; preds = %bb_2
  store i64 0, ptr %local_11, align 8
  %call_arg_06 = load i64, ptr %local_11, align 8
  call void @move_rt_abort(i64 %call_arg_06)
  unreachable

bb_5:                                             ; preds = %bb_4
  store ptr %local_3, ptr %local_12, align 8
  %load_store_tmp7 = load i8, ptr %local_1, align 1
  store i8 %load_store_tmp7, ptr %local_13, align 1
  %loaded_alloca = load ptr, ptr %local_12, align 8
  call void @move_native_vector_push_back(ptr @__move_rttydesc_u8, ptr %loaded_alloca, ptr %local_13)
  %load_store_tmp8 = load i8, ptr %local_1, align 1
  store i8 %load_store_tmp8, ptr %local_14, align 1
  store i8 1, ptr %local_15, align 1
  %add_src_0 = load i8, ptr %local_14, align 1
  %add_src_1 = load i8, ptr %local_15, align 1
  %add_dst = add i8 %add_src_0, %add_src_1
  %ovfcond = icmp ult i8 %add_dst, %add_src_0
  br i1 %ovfcond, label %then_bb, label %join_bb

then_bb:                                          ; preds = %bb_5
  call void @move_rt_abort(i64 4017)
  unreachable

join_bb:                                          ; preds = %bb_5
  store i8 %add_dst, ptr %local_16, align 1
  %load_store_tmp9 = load i8, ptr %local_16, align 1
  store i8 %load_store_tmp9, ptr %local_1, align 1
  br label %bb_6

bb_0:                                             ; preds = %bb_6
  %load_store_tmp10 = load { ptr, i64, i64 }, ptr %local_3, align 8
  store { ptr, i64, i64 } %load_store_tmp10, ptr %local_17, align 8
  %call_arg_011 = load { ptr, i64, i64 }, ptr %local_17, align 8
  %retval12 = call %struct.ascii__String @ascii__string({ ptr, i64, i64 } %call_arg_011)
  store %struct.ascii__String %retval12, ptr %local_18, align 8
  %load_store_tmp13 = load %struct.ascii__String, ptr %local_18, align 8
  store %struct.ascii__String %load_store_tmp13, ptr %local_2, align 8
  store ptr %local_2, ptr %local_19, align 8
  %call_arg_014 = load ptr, ptr %local_19, align 8
  %retval15 = call ptr @ascii__as_bytes(ptr %call_arg_014)
  store ptr %retval15, ptr %local_20, align 8
  %loaded_alloca16 = load ptr, ptr %local_20, align 8
  %retval17 = call i64 @move_native_vector_length(ptr @__move_rttydesc_u8, ptr %loaded_alloca16)
  store i64 %retval17, ptr %local_21, align 8
  store i64 128, ptr %local_22, align 8
  %eq_src_0 = load i64, ptr %local_21, align 8
  %eq_src_1 = load i64, ptr %local_22, align 8
  %eq_dst = icmp eq i64 %eq_src_0, %eq_src_1
  store i1 %eq_dst, ptr %local_23, align 1
  %cnd18 = load i1, ptr %local_23, align 1
  br i1 %cnd18, label %bb_8, label %bb_7

bb_8:                                             ; preds = %bb_0
  br label %bb_9

bb_7:                                             ; preds = %bb_0
  store i64 0, ptr %local_24, align 8
  %call_arg_019 = load i64, ptr %local_24, align 8
  call void @move_rt_abort(i64 %call_arg_019)
  unreachable

bb_9:                                             ; preds = %bb_8
  store ptr %local_2, ptr %local_25, align 8
  %call_arg_020 = load ptr, ptr %local_25, align 8
  %retval21 = call i1 @ascii__all_characters_printable(ptr %call_arg_020)
  store i1 %retval21, ptr %local_26, align 1
  %not_src = load i1, ptr %local_26, align 1
  %not_dst = xor i1 %not_src, true
  store i1 %not_dst, ptr %local_27, align 1
  %cnd22 = load i1, ptr %local_27, align 1
  br i1 %cnd22, label %bb_11, label %bb_10

bb_11:                                            ; preds = %bb_9
  br label %bb_12

bb_10:                                            ; preds = %bb_9
  store i64 1, ptr %local_28, align 8
  %call_arg_023 = load i64, ptr %local_28, align 8
  call void @move_rt_abort(i64 %call_arg_023)
  unreachable

bb_12:                                            ; preds = %bb_11
  %call_arg_024 = load %struct.ascii__String, ptr %local_2, align 8
  %retval25 = call { ptr, i64, i64 } @ascii__into_bytes(%struct.ascii__String %call_arg_024)
  store { ptr, i64, i64 } %retval25, ptr %local_30, align 8
  %load_store_tmp26 = load { ptr, i64, i64 }, ptr %local_30, align 8
  store { ptr, i64, i64 } %load_store_tmp26, ptr %local_0, align 8
  store ptr %local_0, ptr %local_31, align 8
  %loaded_alloca27 = load ptr, ptr %local_31, align 8
  %retval28 = call i64 @move_native_vector_length(ptr @__move_rttydesc_u8, ptr %loaded_alloca27)
  store i64 %retval28, ptr %local_32, align 8
  store i64 128, ptr %local_33, align 8
  %eq_src_029 = load i64, ptr %local_32, align 8
  %eq_src_130 = load i64, ptr %local_33, align 8
  %eq_dst31 = icmp eq i64 %eq_src_029, %eq_src_130
  store i1 %eq_dst31, ptr %local_34, align 1
  %cnd32 = load i1, ptr %local_34, align 1
  br i1 %cnd32, label %bb_14, label %bb_13

bb_14:                                            ; preds = %bb_12
  br label %bb_15

bb_13:                                            ; preds = %bb_12
  store i64 2, ptr %local_35, align 8
  %call_arg_033 = load i64, ptr %local_35, align 8
  call void @move_rt_abort(i64 %call_arg_033)
  unreachable

bb_15:                                            ; preds = %bb_14
  ret void
}

The command to reproduce the compilation follows:

LLVM_SYS_150_PREFIX=/path/to/move-dev \
PLATFORM_TOOLS_ROOT=/path/to/platform-tools \
MOVE_NATIVE=/path/to/move/language/move-native \
cargo run -p move-cli --features solana-backend --bin move -- test --solana -p language/tools/move-cli/tests/move_unit_tests/assign_dev_addr_for_dep

This will compile all of the stdlib source files along with all stdlib unit test files. As it happens, any Move unit test depends on stdlib, and when move-cli builds the stdlib in test mode, it also compiles all stdlib unit tests.