Closed (jstupak closed 9 years ago)
I am now consistently getting timeouts (4 out of 4 tries) after making a rather trivial change to the XML: the numbers of unaddressable and unmaskable pixels are now associated with individual ROCs rather than with the whole module. Since the number of pictures being uploaded did not change, this suggests to me that the problem has little or nothing to do with the number of pictures uploaded, as has been claimed.
The particular module I have been testing with (P-A-2-04) now has a very large amount of data associated with it. I was successful with the first 15 uploads I tried, then 2 of the next 3 failed, then 4 of the next 4 failed. Are the results of past uploads somehow being re-analyzed when we upload new data?
I looked into the issue, and my conclusion is that the problem is now due to the number of pictures already in the database rather than the number of pictures being uploaded. The database was never meant to handle 3000+ pictures for a single module, and it is now timing out while still assessing where to place the next picture.
So I am tempted to conclude that the new slimmed-down zip we are uploading is probably okay. We only uploaded it so many times to test how reliable the upload was.
I have two questions for you, John.
1) What is the average number of pictures you expect for any given module?
2) What is the maximum number of pictures you think you would ever need for a module?
The plan is to upload 141 images per module, plus ~20 for modules which are tested at an x-ray center (but these will be uploaded separately). I would anticipate that after testing and uploading results, we will occasionally spot a problem and have to re-test and upload again. So I would say the average is ~200 with a maximum of ~400.
Can we consider this issue closed? If you want, I can put a cap on the number of pictures that can be associated with a module to ensure this doesn't happen again.
Sure, we can consider it closed, since we didn't see any issues in recent uploads. If need be, I can open a new issue in the future.
I wouldn't bother creating the cap. This was a one-time-only stress test.
Apparently pull requests can't be reopened, so I am starting a continuation of #37.
I just tried uploading a slimmed-down zip file 18 times. 16 times the upload was successful. One time it failed after processing just 43 images. So if we want this to work reliably, we should stick to ~20 plots per zip file. We want to upload 141 plots, so we would have to upload ~7 zip files to get the images we want into the DB, and who knows how many more to get the 70 configs into the DB. That is not really a practical option.
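As a rough check of that arithmetic, here is a minimal sketch of splitting the plots into zip-sized batches. The plot file names and the helper `split_into_batches` are hypothetical, for illustration only. Strictly, 141 plots at 20 per zip comes to 8 files (seven full ones plus one holding a single plot), which is in line with the ~7 estimate.

```python
def split_into_batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Hypothetical plot names; the real file names are not specified in this thread.
plots = [f"plot_{i:03d}.png" for i in range(141)]

# ~20 plots per zip, the size that looked reliable in the tests above.
batches = split_into_batches(plots, 20)

print(len(batches))       # number of zip files needed
print(len(batches[-1]))   # number of plots left over in the last zip
```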
The best option I see at this point is an easy way for shifters to request that a set of test results for a given module be wiped clean. It is clear to the shifters when an upload fails, so they could request removal of the partial test results and then try the upload again. This avoids having partial duplicate entries in the DB. I know Kamal didn't want to give us permission to delete things ourselves, but maybe a button that just sends a request for someone at Purdue to do the delete would be okay? It would be nice if the shifter could upload new results immediately instead of waiting for the cleanup to happen, so the request should include a time (or something similar) that makes clear what should be deleted: only test results uploaded within a ~10 minute window prior to the time specified in the request.
This feature would also be useful in case a shifter uploads results but we later realize there was an issue with the HV or something and want to remove the junk results.
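To make the time-window idea concrete, here is a minimal sketch of how a delete request could be resolved into the list of uploads to remove. It assumes each upload record carries a module name and an upload timestamp. The record layout, the field names (`module`, `uploaded_at`), the function `uploads_in_window`, the second module name, and the dates are all hypothetical and are not taken from the existing DB code.

```python
from datetime import datetime, timedelta


def uploads_in_window(uploads, module, request_time, window_minutes=10):
    """Return uploads for `module` made within `window_minutes` before `request_time`.

    Uploads made after `request_time` are excluded, so a shifter can
    re-upload right away without the new results being caught by the cleanup.
    """
    window_start = request_time - timedelta(minutes=window_minutes)
    return [
        u for u in uploads
        if u["module"] == module and window_start <= u["uploaded_at"] <= request_time
    ]


# Hypothetical records for illustration only.
uploads = [
    {"id": 1, "module": "P-A-2-04", "uploaded_at": datetime(2016, 1, 1, 11, 30)},  # older, good upload
    {"id": 2, "module": "P-A-2-04", "uploaded_at": datetime(2016, 1, 1, 11, 55)},  # failed, partial upload
    {"id": 3, "module": "P-A-2-04", "uploaded_at": datetime(2016, 1, 1, 12, 5)},   # re-upload after the request
    {"id": 4, "module": "P-B-1-01", "uploaded_at": datetime(2016, 1, 1, 11, 58)},  # a different module
]

request_time = datetime(2016, 1, 1, 12, 0)
to_delete = uploads_in_window(uploads, "P-A-2-04", request_time)
print([u["id"] for u in to_delete])  # [2]
```

Only the partial upload inside the window is selected. The older results, the immediate re-upload, and the other module are left alone.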