gradle / ge-export

Apache License 2.0
3 stars 5 forks source link

Error when exporting identical build records #4

Closed w25r closed 7 years ago

w25r commented 7 years ago

If I'm exporting a build that already exists in the database, rather than failing, it should skip that build, and perhaps give a warning or informational message.

w25r commented 7 years ago

There may also be some performance concerns to consider, if we need to verify a build already exists in the database.

wisechengyi commented 7 years ago

i can try to take on this one

wisechengyi commented 7 years ago

There is some complications in terms of transaction completion, for example, the following can be interrupted at any moment.

        currentBuild.setId( insertOrGetBuild(currentBuild) );
        currentBuild.taskMap.values().stream().forEach( TasksDAO::insertTask );
        currentBuild.testMap.values().stream().forEach( TestsDAO::insertTest );
        currentBuild.customValues.stream().forEach(CustomValueDAO::insertCustomValue);

Hence I would like to propose to create a intermediate table in the destination db to keep track of the in-progress syncs, e.g. in_progress_build_syncs, so it would look like:

public static void persist(EventProcessor processor) {
    Build currentBuild = processor.currentBuild;
    currentBuild.resolveStatus();

    // incomplete sync
    if (currentBuild in BUILDS_TABLE && currentBuild in IN_PRGRESS_BUILD_SYNCS_TABLE) {
        delete from builds, tasks, tests, custom_values where build_id = currentBuild.getBuildId()
        insertBuild(mybuild)
    }
    // completed, no action needed
    else if (currentBuild in BUILDS_TABLE && currentBuild not in IN_PRGRESS_BUILD_SYNCS_TABLE){
      // do nothing
    }
    // new builds
    else {
        insertBuild(mybuild)
    }
}

public void insertBuild(Build currentBuild) {
    insert currentBuild into in_progress_build_syncs

    currentBuild.setId( ins(currentBuild) );
    currentBuild.taskMap.values().stream().forEach( TasksDAO::insertTask );
    currentBuild.testMap.values().stream().forEach( TestsDAO::insertTest );
    currentBuild.customValues.stream().forEach(CustomValueDAO::insertCustomValue);

    remove currentBuild from in_progress_build_syncs
}

Do you think that is reasonable, @w25r ?

Edit: changed pseudo code, there are actually 3 cases.

w25r commented 7 years ago

I think we should ignore the incomplete updates for now.

In this step, let's just skip a build record if it's already in the database. You're right that there's then a risk that incomplete data exists. That is its own issue, which we can log separately, and we should try to fix that by making insertBuild transactional and therefore safe. Thoughts?

wisechengyi commented 7 years ago

Ah yes transactional should be the way to go, but I am not an expert in that subject that yet, so this is more of a beginner's workaround.

I think it is fine to skip for now, so the next step should be either making things transaction or implement the workaround. We do care about data integrity to some extent, especially the goal is to pipeline the whole thing when

  1. potential/frequent restarts happen due to various reasons
  2. people may ask why the same builds have so many differences in tasks, tests.
w25r commented 7 years ago

@wisechengyi do you have something ready to go to submit a PR for this? I don't think we need to solve the transaction problem right now, but it would be helpful to not fail if the IDs exist already.

wisechengyi commented 7 years ago

should be able to get the PR out today

wisechengyi commented 7 years ago

13

w25r commented 7 years ago

Fixed by @wisechengyi in #13