Sage-Bionetworks / synapsePythonClient

Programmatic interface to Synapse services for Python
https://www.synapse.org
Apache License 2.0
67 stars 68 forks source link

[SYNPY-1298] Update annotation logic #995

Closed BryanFauble closed 11 months ago

BryanFauble commented 11 months ago

Problem: When an entity is created the set_annotations method is always being called even if the entity is being created without any annotations.

A side-effect of this is that it is causing some Integration test instability - It also saves an extra HTTP call when creating a new entity w/o annotations.

  1. https://github.com/Sage-Bionetworks/synapsePythonClient/actions/runs/6501857782/job/17659897335#step:8:344
  2. https://github.com/Sage-Bionetworks/synapsePythonClient/actions/runs/6511652889/job/17687757507#step:8:343

Solution: Checking to see if we are creating the entity in the first place only call set_annotations if there are annotations to be set.

Testing: I added 2 Integration tests around this logic.

BryanFauble commented 11 months ago

🔥 LGTM! is this caused by setting annotations in parallel via the tests?

@thomasyu888 I am uncertain - I spent some time to comb through the logical path this code took and under no circumstances should it have ran into the error: Object: syn12392625 was updated since you last fetched it, retrieve it again and re-apply the update https://github.com/Sage-Bionetworks/synapsePythonClient/actions/runs/6511652889/job/17687757507#step:8:448

Looking at what that specific integration test is doing:

     def test_download_file_false(syn, project, schedule_for_cleanup):
        RENAME_SUFFIX = "blah"

        # Upload a file
        filepath = utils.make_bogus_binary_file()
        schedule_for_cleanup(filepath)
        schedule_for_cleanup(filepath + RENAME_SUFFIX)
        file = File(filepath, name="SYNR 619", parent=project)
>       file = syn.store(file) # <--- Error occured here

It doesn't make sense to me how even if this were running in parallel with other tests how this could have occurred, my reasoning:

  1. Each worker node that the tests are split up into create it's own Project to use for the tests. Session scoped fixtures are ran once per node. We run the gitlab runners with 2 nodes. This is where the project is created: https://github.com/Sage-Bionetworks/synapsePythonClient/blob/develop/tests/integration/conftest.py#L39
  2. Since each worker node has it's own project and on each node the tests are executed sequentially they should never interfere with eachother or cause stateful issues (Unless the tests themselves are referencing the same project/files/entities).
  3. A new file is created during this test run: utils.make_bogus_binary_file() - but the filename is static in this case. To eliminate the file name possibly causing issues with the test in this case I am swapping the file name to this (in a future PR):
    file_name = "SYNR-619-" + str(uuid.uuid4())
    file = File(filepath, name=file_name, parent=project)