getodk / briefcase

ODK Briefcase is a Java application for fetching and pushing forms and their contents. It helps make billions of data points from ODK portable. Contribute and make the world a better place! ✨💼✨
https://docs.getodk.org/briefcase-intro
Other
60 stars 156 forks source link

When pushing to Central, push form definitions for all versions referenced in submissions #872

Closed lognaturel closed 3 years ago

lognaturel commented 4 years ago

Central only accepts submissions for form versions that it is aware of. Aggregate allows limited form versioning but only retains the latest form version. Briefcase only has awareness of the last form version it pulled.

There was a v2 planned to address this and other limitations but unfortunately we've been struggling to get the time and resources to make it happen.

TODO:

This PR:

Acceptance criteria:

ggalmazor commented 4 years ago

Just saw this :) I'm not sure if it's ready, but I will take a lookie look anyway and see how it goes ;)

ggalmazor commented 4 years ago

In released versions, this means that if a user exports submissions and then pulls a new form version, exports "from last" will start at the beginning because the metadata about last exported submission will be lost. @ggalmazor maybe you have some insights.

Oops. That sounds like an unintended behavior. That information shouldn't be lost between pulls

ggalmazor commented 4 years ago

@lognaturel, I did a quick review and I think your approach is OK. Some thoughts/notes (probably things that you already know, though):

lognaturel commented 4 years ago

Thanks so much for taking a look, @ggalmazor, and for all the helpful feedback.

That sounds like an unintended behavior. That information shouldn't be lost between pulls

Interesting! Do you happen to remember why version is included in FormKey? I expected the form metadata repository to be keyed on logical forms rather than specific versions. Is there a reason not to do that?

codecov-commenter commented 3 years ago

Codecov Report

Merging #872 into master will decrease coverage by 1.20%. The diff coverage is 34.04%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #872      +/-   ##
============================================
- Coverage     48.33%   47.12%   -1.21%     
+ Complexity     1647     1621      -26     
============================================
  Files           195      195              
  Lines         10422    10538     +116     
  Branches        753      764      +11     
============================================
- Hits           5037     4966      -71     
- Misses         5026     5219     +193     
+ Partials        359      353       -6     
Impacted Files Coverage Δ Complexity Δ
...rg/opendatakit/briefcase/pull/PullFromCollect.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...c/org/opendatakit/briefcase/ui/push/PushPanel.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
src/org/opendatakit/briefcase/ui/reused/UI.java 43.07% <0.00%> (-2.09%) 8.00 <0.00> (ø)
...transfer/sourcetarget/CentralServerDialogForm.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...d/transfer/sourcetarget/source/FormInComputer.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...reused/transfer/sourcetarget/target/Aggregate.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...i/reused/transfer/sourcetarget/target/Central.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...eused/transfer/sourcetarget/target/PushTarget.java 0.00% <ø> (ø) 0.00 <0.00> (ø)
...takit/briefcase/ui/settings/SettingsPanelForm.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...rg/opendatakit/briefcase/util/TransferFromODK.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update b55a171...6726972. Read the comment docs.

lognaturel commented 3 years ago

@getodk/testers There's a lot of behavior here and weird edge cases depending on what versions are already on the Central server and when forms were pulled. I think I've come up with something that makes sense and handles most of them (except for forms with blank versions) but before we do code review it would be helpful for you to do a discovery pass so we identify anything missing or that just doesn't make sense.

kkrawczyk123 commented 3 years ago

@lognaturel I have done some testing today and of course, I will continue but wanted already to share with some feedback:

Another thing that I was thinking about: I don’t really like the fact that draft versions of forms are visible on download list of briefcase ex. Project 166 cascading select test form: downloading result displayed in briefcase is: success with errors what is misleading as in fact the form is not even downloaded, empty folder is created in sd without xml file. (form is not visible on push list what is actually a good result)

lognaturel commented 3 years ago

Thank you, @kkrawczyk123! Great finds all around. I've updated the PR post with todos.

I would love URL on information dialog to be clickable!

Yes, me too. It's not all that straightforward as far as I can tell so let's file it as a separate issue once this is merged.

maybe connected more that version is not defined

That's right, blank versions are not supported. I'd like to treat that as a separate issue, I think. I'll give it another look.

I am not able to download the results. test.central.getodk.org/#/projects/188/forms/all-widgets/submissions

Thanks, this is an issue with Central exports: https://github.com/getodk/central-backend/issues/297.

I realized that project is an encrypted one ale Form got unencrypted

I experienced this as well. I am pretty sure this is something that Central should handle - https://github.com/getodk/central-backend/issues/298.

I don’t really like the fact that draft versions of forms are visible on download list of briefcase

I agree: https://github.com/getodk/briefcase/issues/861

kkrawczyk123 commented 3 years ago

@lognaturel I have another dose of questions/concerns:

Given: I have confirmed a push to Central And: there was a form with the corresponding formId on my Central server And: all versions that Briefcase has submissions for were previously uploaded to Central And: the form was deleted Then: the form is in draft state on my Central server and can't be published because of a version conflict And: I see errors about all of the form versions and submissions failing to upload

Briefcase's status is "Success with errors" what is acceptable as draft form version is pushed but I can see another issue here which is actually more connected to Central than to Briefcase. I went to project and tried to publish draft version and I can't to that, text field to specify version is missing what makes form useless right now. Screenshot from 2020-10-12 11-43-41

Given: I have pulled a form definition and submissions from Aggregate, Central, or a Collect directory from a prior version of Briefcase (<=v1.17.4) And: some submissions are for the active form definition's version and others for prior form versions When: I push that form and submissions to Central Then: submissions for versions other than the version Briefcase knows about fail And: in the details, I see a message about trying again and a reference to documentation

I can't see the difference between case where I am pulling form with submissions of different form versions using Briefcace 1.17.4 or PR-version and then pushing it again to new project on Central using PR-version: all submissions are sent and errors about failing to push media files are displayed only.

Given: I have pulled a form definition and submissions from Aggregate, Central, or a Collect directory from a prior version of Briefcase (<=v1.17.4) And: some submissions are for the active form definition's version and others for prior form versions And: I let a push attempt complete and fail When: I retry Then: all form versions that were not previously uploaded to Central are now on my Central server And: all submissions are pushed And: the form version Briefcase knows about is not the actively published version

Here as a step "I let a push attempt complete and fail" you mean use Briefcase 1.17.4 in first push and the second push using PR-version? I do not fully understand the difference between that scenario and the other above. I have the same results.

I am wondering how the last published version is actually selected on Central? Does it happen randomly? (ex. project 193). See the gif: Peek 2020-10-12 13-40gif4

lognaturel commented 3 years ago

errors about failing to push media files

I'm very puzzled about what's going on here. Do you have a form with a single version and media files that you could do a pull and then push with 1.17.4 with? I suspect there's an existing issue.

I went to project and tried to publish draft version and I can't to that, text field to specify version is missing what makes form useless right now.

I think you got in this state by deleting a form with that form version and then trying to create it again. I agree that this can lead to a user being stuck and will explore with the Central team.

I can't see the difference between case where I am pulling form with submissions of different form versions using Briefcace 1.17.4 or PR-version and then pushing it again to new project on Central using PR-version

You'd have to make sure to clear your Briefcase directory before doing a pull from <=1.17.4 so that the form version cache is cleared. Then also make sure you're pushing to a project that doesn't know about that form at all.

you mean use Briefcase 1.17.4 in first push and the second push using PR-version?

No, the first push would have to be using the PR version. So this is download everything with Briefcase 1.17.4 after having cleared the Briefcase directory. Then go to the PR version and push. There should be failures. Then try again and it should succeed. However, the published version will not be the latest one. This I will describe in documentation. As you noticed:

I am wondering how the last published version is actually selected on Central? Does it happen randomly?

If you already had things pulled in <=1.17.4, there's no information about submission versions. That means Briefcase will only push the form version it knows about and will publish it. ~When you push again, it will know about older form versions and push those in an unpredictable order. Which ever one is last will be published. So yes, the published version is entirely unpredictable in that case. In the case where the pull happened with this PR, the order of the versions before the published one is also unpredictable.~ EDIT: I now push obsolete versions in alphanumeric order so at least coming from Aggregate the order will generally be sensible.

kkrawczyk123 commented 3 years ago

I'm very puzzled about what's going on here. Do you have a form with a single version and media files that you could do a pull and then push with 1.17.4 with? I suspect there's an existing issue.

lognaturel commented 3 years ago

Thanks for all the detailed checks, @kkrawczyk123. Before you dive back in, I encourage you to read through https://github.com/getodk/docs/pull/1272. Hopefully that will clarify the intended behavior for weird edge cases. If not, let me know what parts are unclear!

by "empty submission" you mean one that they don't have submissionDate in exported by Briefcase file

I was seeing instance folders with absolutely no contents, not even submission.xml. It turns out to be related to the export issue you were having -- the QA server just didn't have enough memory to deal with the largest media files. I've now added swap and everything is behaving as I would expect. I'll work on updating Central docs so this is really clear. I didn't expect these issues with relatively small media files and I forgot we didn't have swap on that machine. If there's anything else expected going on related to transfer, memory should be our first suspicion. I think you're right that this is what you experienced when you pulled 256 and ended up pushing 226.

I'm not seeing any failures to push media files anymore so maybe that was also related?

use the same sd

The form versions that need to be created on Central are identified on pull. If you pull on v1.17.4 and use the same storage directory to push using this PR, older form versions won't be pushed because the form version list is stored in the storage directory and v1.17.4 just doesn't record that info. But the process of pushing on this PR generates the list of form versions to push. So once you've attempted a push, that storage directory knows about all the versions, and a future push with this PR will send all form versions. Does that make sense? This might be one to hop on a quick call about if you think there might be a bug or if you're still not feeling clear on the intended behavior (don't worry, it's super hard to reason about).

When I interrupt pushing using Cancel button - the operation is canceled but success status is displayed, see the gif:

Is it possible this only happens if you cancel right after the "Start pushing form definition..." message is emitted but before anything actually gets pushed? I can't get that behavior. If I cancel immediately right after tapping push, the cancel has no effect. This is a bad state but it's the same on v1.17.4. It's also unlikely so I'm not too worried about it. If I push once submissions are sending, I see something like

Submission 10 of 1902 sent
Sending submission 17 of 1902
Submission 11 of 1902 sent
Sending submission 18 of 1902
Submission 12 of 1902 sent
Sending submission 19 of 1902
Submission 13 of 1902 sent
Sending submission 20 of 1902
Cancelled by user
Submission 14 of 1902 sent

In that case, it's annoying because the "Cancelled" message may not be the last one. But I think it's ok.

The most similar case I can get is if I cancel right after form creation:

Start pushing form and submissions
Creating form (all-widgets)
Cancelled by user
Form created (all-widgets)
Operation cancelled - Push form version "2017053001"
Operation cancelled - Publish form version "2017053001"
Operation cancelled - Push form version "2017053003"
Operation cancelled - Publish form version "2017053003"
Operation cancelled - Push form version "2017121301"
Operation cancelled - Publish form version "2017121301"
Operation cancelled - Push form version "2017121302"
Operation cancelled - Publish form version "2017121302"
Operation cancelled - Push form version "2017121304"
Operation cancelled - Publish form version "2017121304"
Operation cancelled - Push form version "2017121307"
Operation cancelled - Publish form version "2017121307"
Operation cancelled - Push form version "2018022001"
Operation cancelled - Publish form version "2018022001"
Operation cancelled - Push form version "2018051401"
Operation cancelled - Publish form version "2018051401"
Operation cancelled - Push form version "2018051409"
Operation cancelled - Publish form version "2018051409"
Operation cancelled - Push form version "2018081601"
Operation cancelled - Publish form version "2018081601"
Operation cancelled - Push form version "2019020201"
Operation cancelled - Publish form version "2019020201"
Operation cancelled - Push form version "2019020208"
Operation cancelled - Publish form version "2019020208"
Operation cancelled - Push form version "2019051302"
Operation cancelled - Publish form version "2019051302"
Operation cancelled - Push form version "2017121308"

In that case it looks like there's no final message.

I have encountered on weird thing while trying to reporoduce issue with failing media. I added form manually to central and send submissions with media attachments from Collect project: test.central.getodk.org/#/projects/198 I can pull it using briefcase but I can't push it to another project where that form was not previously added.

Wow, that was indeed crazy! The problem was that one of the submissions had an XML media attachment. Briefcase was treating that like a form definition and then getting confused because there were two form definitions at different paths. Should be fixed, now.

I believe I have addressed all of the issues you highlighted except that I wasn't able to satisfactorily reproduce the "Success" after cancelling issue. Does that sound right? If not, please summarize issues from above that are still present so we make sure we're on the same page. 🙏

codecov-io commented 3 years ago

Codecov Report

Merging #872 into master will decrease coverage by 1.21%. The diff coverage is 36.41%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #872      +/-   ##
============================================
- Coverage     48.33%   47.11%   -1.22%     
+ Complexity     1647     1621      -26     
============================================
  Files           195      195              
  Lines         10422    10531     +109     
  Branches        753      764      +11     
============================================
- Hits           5037     4962      -75     
- Misses         5026     5217     +191     
+ Partials        359      352       -7     
Impacted Files Coverage Δ Complexity Δ
...rg/opendatakit/briefcase/pull/PullFromCollect.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...c/org/opendatakit/briefcase/ui/push/PushPanel.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
src/org/opendatakit/briefcase/ui/reused/UI.java 43.07% <0.00%> (-2.09%) 8.00 <0.00> (ø)
...transfer/sourcetarget/CentralServerDialogForm.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...d/transfer/sourcetarget/source/FormInComputer.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...reused/transfer/sourcetarget/target/Aggregate.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...i/reused/transfer/sourcetarget/target/Central.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...eused/transfer/sourcetarget/target/PushTarget.java 0.00% <ø> (ø) 0.00 <0.00> (ø)
...takit/briefcase/ui/settings/SettingsPanelForm.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...rg/opendatakit/briefcase/util/TransferFromODK.java 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update b55a171...a94b78b. Read the comment docs.

mmarciniak90 commented 3 years ago

@lognaturel I noticed a different behavior in one case. Probably it's uncommon.

  1. I pulled 3 different versions of form via Form definition - form 1648 from odk.zip
  2. Then I tried to push it to Central

The form was pushed with errors on v1.17.4. It appeared in Central.

Zrzut ekranu 2020-10-20 o 17 22 52

But when I tried this case with these changes, I clicked on the push but nothing happened.

Zrzut ekranu 2020-10-20 o 17 26 34
kkrawczyk123 commented 3 years ago

List of test cases verified/retested so far:

lognaturel commented 3 years ago

It’s looking like we’re almost there!

I clicked on the push but nothing happened

Is it possible the form on Central is a draft and not ending up getting published? What if you publish the draft? I will give this a try hopefully tomorrow.

mmarciniak90 commented 3 years ago

Is it possible the form on Central is a draft and not ending up getting published?

@lognaturel I looked at this case once again and I realized that the attached data are complex so I tried the same case with a much more simple form and the result was correct. It looks like an issue with strange media in form. There are xml files in form media. This is my special form 1648.zip

I pulled this form from Aggregate and from form directory. In both cases, nothing happened when I clicked on the 'Push' button. But still, the released version is able to push it and create a draft version.

kkrawczyk123 commented 3 years ago

and more cases have been verified:

cmd push to Central form with submissions of different versions validation on push encrypted form ui and cmd cmd push to Central form pulled from Aggregate cmd push to Central form pulled from Central cmd push to Central form with submissions of different versions and media cmd push form with media and submissions with media attached cmd push form with submissions of different versions when there is already published form on Central cmd push form with submissions of different versions when there is already draft form on Central cmd pull from odk folder, cmd push to Central cmd form pulled with 1.17.4 and pushed: only form known by briefcase and it's submissions are cmd pushed cmd re-try push: all submissions are pushed, different versions visible on Central. cmd pull Kasia's original form, and cmd push Kasia's original form cmd push without specifying form id I honestly don't have ideas for more, we brainstormed together with @mmarciniak90 @lognaturel I still can reproduce the issue with Briefcase freeze when a submission has xml attachment. Could you please take a look at that once again?

lognaturel commented 3 years ago

The cases look quite complete. I'll look at those two remaining issues ASAP. Thank you!

lognaturel commented 3 years ago

I believe that both of you were running into the issue where form definitions used as media were causing conflicts. I had updated the method to look for form definitions but didn't update the call. It should now be fixed.

kkrawczyk123 commented 3 years ago

We have confirmed that issues have been fixed. We have verified both via ui and cmd and also did quick regression check. We both agree that it's ready to go! Verified on Ubuntu and MacOS.

@getodk-bot unlabel "needs testing" @getodk-bot label "behavior verified"