alvacouch closed this issue 7 years ago
@alvacouch What about www? Can you do the same check there?
WWW is much harder to test because it's prior to the ResourceFile normalization. Possible but requires me to run the migration (at least in spirit) in order to perform the test. Lots more work because the migration is run in a different syntax than production. I would need to port it.
Is this necessary?
@alvacouch Don't worry about it then.
@pkdash Will look into checking www after I finalize this version. Would that be a branch off of master?
@alvacouch If you want this validation check to run on www prior to the 1.10 release, then the branch needs to be off of 'master' so that @mjstealey can apply it as a hotfix to master. However, I think that in this particular hotfix case, master should not be merged back into 'develop', since develop will have the real validator based on the new file API.
Here are the results of the first runs of the validation script:
OUTPUT-develop.txt (EDIT: updated for new script) OUTPUT-www.txt
These do not compensate for misplaced hydroshareUserProxy resources. EDIT: These are fixed in the new script for develop
I checked several of the "problem" resources in OUTPUT-www.txt and found that quite a few are not a problem.
It would be good to avoid reporting these so we can see the ones that are problems more easily.
The concerning errors I found (in spot checking) are:
ERROR: no valid file name defined for 6b165657dc1f437a873c58d07c3d97a9 (NetcdfResource)
check_irods_files: django file None in folder None, resolved to None, does not exist in iRODS
ERROR: no valid file name defined for 6b165657dc1f437a873c58d07c3d97a9 (NetcdfResource)
check_irods_files: django file None in folder None, resolved to None, does not exist in iRODS
check_irods_files: listing of iRODS directory 6b165657dc1f437a873c58d07c3d97a9/data/contents failed
check_irods_files: affected resource 6b165657dc1f437a873c58d07c3d97a9 type is NetcdfResource, title is 'This resource can't be deleted'
This is a resource that @gantian127 owns and she has flagged as unable to delete. I suggest we just delete this as admin.
Some resources are indicated to be federated; they appear to have been created from iRODS because they are bigger than 1 GB. Some of these are iUtah resources, so we should work with @horsburgh and @AmberSJones to diagnose.
WARNING: federated file name or path declared for unfederated resource 094da7d9400f493fb1e412df015e17a4 (GenericResource): data/contents/annual-reports-by-year.zip
INFO: data/contents/ stripped from fed name or path: annual-reports-by-year.zip for 094da7d9400f493fb1e412df015e17a4 (GenericResource)
check_irods_files: file 094da7d9400f493fb1e412df015e17a4/data/contents/2009-and-later-annual-reports-coded (3).csv in iRODs does not exist in Django
check_irods_files: file 094da7d9400f493fb1e412df015e17a4/data/contents/2009-and-later-variable-codebook (3).csv in iRODs does not exist in Django
For the resource below, the UI shows 4 files, suffixed 2013-2016. If the bag is downloaded, it also holds 4 files. So the file iUTAH_GAMUT_RB_TM_C_RawData_2017.csv does not appear to be in the system, and the user does not see any error.
ERROR: existing path aff4e6dfc09a4070ac15a6ec0741fd02/data/contents/iUTAH_GAMUT_RB_TM_C_RawData_2017.csv is not conformant for aff4e6dfc09a4070ac15a6ec0741fd02 (GenericResource)
ERROR: no valid file name defined for aff4e6dfc09a4070ac15a6ec0741fd02 (GenericResource)
check_irods_files: django file aff4e6dfc09a4070ac15a6ec0741fd02/data/contents/iUTAH_GAMUT_RB_TM_C_RawData_2017.csv in folder None, resolved to None, does not exist in iRODS
check_irods_files: affected resource aff4e6dfc09a4070ac15a6ec0741fd02 type is GenericResource, title is 'iUTAH GAMUT Network Raw Data at Todd's Meadow Climate Site (RB_TM_C)'
@pkdash @hyi @aphelionz Just a quick note. I remember now that in revising `ResourceFile`, I took some pains to make sure that `ResourceFile.delete` works properly now. It didn't work properly before. This could explain a lot of the garbage I am finding as unreferenced iRODS files, especially in the directories with a lot of churn. In other words, it might be a bug that has already been squashed.
@alvacouch Great. Then I think those unreferenced iRODS files can be safely deleted to get us to a cleaner baseline. Perhaps you can provide a switch in your management command that lets an admin delete these unreferenced iRODS files once they decide it is safe to do so.
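A minimal sketch of what such a switch could look like, assuming the check already has the Django and iRODS path listings in hand. The flag name `--delete-unreferenced`, the `find_unreferenced` helper, and the `delete_file` callback are hypothetical and are not part of the existing management command; the only point is that reporting stays the default and deletion is opt-in.

```python
import argparse


def find_unreferenced(django_paths, irods_paths):
    """Return iRODS paths that no Django ResourceFile refers to."""
    return sorted(set(irods_paths) - set(django_paths))


def build_parser():
    parser = argparse.ArgumentParser(description="Check iRODS files against Django")
    # Hypothetical switch: without it, the command only reports.
    parser.add_argument(
        "--delete-unreferenced",
        action="store_true",
        help="delete iRODS files that Django does not reference",
    )
    return parser


def run(argv, django_paths, irods_paths, delete_file):
    """Report unreferenced files; delete them only when the switch is given.

    delete_file is a stand-in for whatever call removes a file from iRODS.
    """
    options = build_parser().parse_args(argv)
    unreferenced = find_unreferenced(django_paths, irods_paths)
    for path in unreferenced:
        print("unreferenced in iRODS:", path)
        if options.delete_unreferenced:
            delete_file(path)
    return unreferenced
```

Called with an empty argument list, `run` only prints the unreferenced paths; deletion happens only when `--delete-unreferenced` is passed explicitly.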
@hyi @pkdash @dtarb @mjstealey A new output from revised check_irods_files has been uploaded to the google drive.
Highlights:
Remaining problems:
guid | type | title | problem |
---|---|---|---|
094da7d9400f493fb1e412df015e17a4 | GenericResource | Utah Municipalities Stormwater Annual Reports | files not in Django |
aff4e6dfc09a4070ac15a6ec0741fd02 | GenericResource | iUTAH GAMUT Network Raw Data at Todds Meadow Climate Site (RB_TM_C) | files not in iRODS |
5e80dd7cbaf04a5e98d850609c7e534b | GenericResource | iUTAH GAMUT Network Raw Data at Knowlton Fork Climate Site (RB_KF_C) | files not in iRODS |
325b21d55b2c49658a91944fabd896cf | GenericResource | iUTAH GAMUT Network Raw Data at the Green Infrastructure Climate Site (RB_GIRF_C) | files not in iRODs |
9e5e99125d1646c69dde9fc43e137667 | GenericResource | iUTAH GAMUT Network Raw Data at Fort Douglas Storm Drain (RB_FortD_SD) | files not in iRODs |
3ebf244bd2084cfaa68b83b7f91e9587 | GenericResource | iUTAH GAMUT Network Raw Data at Trial Lake Climate Site (PR_TL_C) | files not in iRODs |
6aa75450ee2744cdb34ed8dde929a84a | GenericResource | iUTAH GAMUT Network Raw Data at Charleston Climate Site (PR_CH_C) | files not in iRODs |
887180409e4545018c8372f0bd6f8ff3 | GenericResource | iUTAH GAMUT Network Raw Data at Provo River near Charleston Advanced Aquatic Site (PR_CH_AA) | files not in iRODs |
4a5bb3a976004f0ea63991323335b170 | GenericResource | iUTAH GAMUT Network Raw Data at Beaver Divide Climate Site (PR_BD_C) | files not in iRODs |
ecb77926c2484e068f28acda434f8772 | GenericResource | iUTAH GAMUT Network Raw Data at Logan River near the Water Lab Advanced Aquatic Site (LR_WaterLab_AA) | files not in iRODs |
40655b4fc21142d090a5a4b835c14220 | GenericResource | iUTAH GAMUT Network Raw Data at Red Butte Gate Basic Aquatic Site (RB_RBG_BA) | files not in iRODs |
1846b79a648a4088aad987cc7241656f | GenericResource | iUTAH GAMUT Network Raw Data at Red Butte Creek near 900 W (1300 South) Basic Aquatic Site (RB_900W_BA) | files not in iRODs |
94bcad20fbfb4c44ac7f98a0fdfa5e79 | GenericResource | iUTAH GAMUT Network Raw Data at Blacksmith Fork above confluence with Logan River (BSF_CONF_BA) | files not in iRODs |
a22bbdfb431c44a68959534c94e96392 | GenericResource | iUTAH GAMUT Network Raw Data near Connor Road Storm Drain Site (RB_CR_SD) | files not in iRODs |
bc16655330b64bcaa366d464b00e45f0 | GenericResource | iUTAH GAMUT Network Raw Data near Dentistry Building Storm Drain (RB_Dent_SD) | files not in iRODs |
2e9db97be020401c9aa03017cb7ee505 | GenericResource | iUTAH GAMUT Network Raw Data near Green Infrastructure Storm Drain (RB_GIRF_SD) | files not in iRODs |
c3ecee31a0c64490bf6a2fcb4841cee4 | GenericResource | iUTAH GAMUT Network Raw Data at Red Butte Creek at 1300 East Aquatic (RB_1300E_A) | files not in iRODs |
a56608d8948c43fdb302e1438cf09169 | GenericResource | iUTAH GAMUT Network Raw Data at Lower Knowlton Fork Aquatic (RB_LKF_A) | files not in iRODs |
9700b80f5cfa42d4a52c9aaab81a4e11 | CollectionResource | Freshwaterhack project: Comparing spatial datasets | files not in iRODs |
cde532b5d39141db9c2b22122774afae | GenericResource | iUTAH GAMUT Network Quality Control Level 1 Data at Knowlton Fork Climate (RB_KF_C) | files not in iRODs |
86a27290e1b443a488f0b84cb9e2af91 | GenericResource | iUTAH GAMUT Network Quality Control Level 1 Data at Climate Station at Logan River Golf Course (LR_GC_C) | files not in iRODs |
bb41efc853134d0a90fa1da0041367f5 | GenericResource | iUTAH GAMUT Network Quality Control Level 1 Data at Lower Knowlton Fork Aquatic (RB_LKF_A) | files not in iRODs |
200a03e04591410f8b6310b43558634b | GenericResource | iUTAH GAMUT Network Quality Control Level 1 Data at Climate Station at TW Daniels Experimental Forest (LR_TWDEF_C) | files not in iRODs |
f83c4a6ddaec4085bd152dd261a1a89c | GenericResource | iUTAH GAMUT Network Quality Control Level 1 Data at Above Red Butte Reservoir Advanced Aquatic (RB_ARBR_AA) | files not in iRODs |
878093a81b284ac8a4f65948b1c597a2 | GenericResource | iUTAH GAMUT Network Raw Data at USGS Gage 10172200 above Red Butte Reservoir (RB_ARBR_USGS) | files not in iRODs |
b5f0873404b941ef982df72e90fc140c | GenericResource | iUTAH GAMUT Network Raw Data at Provo River at Charleston Central Utah Water Conservancy District Gage (PR_CH_CUWCD) | files not in iRODs |
7f0392828f01467386102ae4b52c3b5a | NetcdfResource | Spatial-temporal statistics of monthly soil moisture data from the NLDAS model (1979-2013) | files not in Django |
fef58369046c4a64a2d7564c4e7e1fd0 | NetcdfResource | Spatial-temporal statistics of monthly evapotranspiration data from the NLDAS model (1979-2013) | files missing from Django and iRODs |
fc00c8eaa0944a4a98ea2ddbfe54320e | NetcdfResource | Spatial-temporal statistics of monthly precipitation data from the NLDAS model (1979-2013) | files not in Django |
ba64d962eb6c460abc9a8628946df116 | NetcdfResource | Spatial-temporal statistics of monthly temperature data from the NLDAS model (1979-2013) | files not in Django |
3f354dd111f24998b37099ebdf478441 | NetcdfResource | Spatial-temporal statistics of monthly surface runoff data from the NLDAS model (1979-2013) | files not in Django |
c9fb977bae21432b8b202f13b62285b1 | NetcdfResource | Spatial-temporal statistics of daily soil moisture data from the NLDAS model (1979-2013) | files missing from Django and iRODs |
f42f1387d7d54d7a9228888381d7c30e | NetcdfResource | Spatial-temporal statistics of daily precipitation data from the NLDAS model (1979-2013) | files missing from Django and iRODs |
ff2e648104254ee4bcf8db925170ea91 | NetcdfResource | Spatial-temporal statistics of daily temperature data from the NLDAS model (1979-2013) | files missing from Django and iRODs |
fbc7af608a324a7a9cbbdd415d0a9499 | NetcdfResource | Spatial-temporal statistics of daily surface runoff data from the NLDAS model (1979-2013) | files missing from Django and iRODs |
fe6edc72a982454b8a86aacd7cfbaf74 | GenericResource | Test | files missing from Django and iRODS |
38a881fd35af49448b483f0343ca60e5 | CollectionResource | Freshwaterhack project: Comparing spatial datasets | files not in iRODs |
faacb77f1a8144c4a232edd8ffdd179b | MODFLOWModelInstanceResource | Metadata | files missing from Django and iRODs |
0183ec4000f644fa9378cf28cfe5c2e2 | GenericResource | Sauk River Basin Observatory | files missing from Django and iRODs |
e81b1fb3cb5a49538d6c2ad3077b7b71 | GenericResource | Test - REMOVE ME | files missing from Django and iRODs |
cdc6292fbee24dfd9810da7696a40dcf | CompositeResource | Comparison of hydrodynamic and low-complexity flood modeling tools | files missing from Django and iRODs |
3f7680cf83dc426e858d5b48cb95a565 | GenericResource | Green Infrastructure Designer with RHESSys Workflow | files missing from Django and iRODs |
dfae7f297db749ccac0c85a7bef56582 | GenericResource | Bayou Fountain | iRODs resource missing |
098bcea9945f4a00ba0be5a84096aa19 | GenericResource | Bayou Fountain | iRODs resource missing |
65593e64416b4fa2a6d58971546c9713 | GenericResource | Bayou Fountain | iRODs resource missing |
de42a9f014c344578d96c1717a520786 | GeographicFeatureResource | cTurnipseed_homewatershed | iRODs resource missing |
95b75ee546b2479c80e1895f95f6d2a1 | GenericResource | Chiamaka.Oyekwe-Madumelu_WaterShed | iRODs resource missing |
26b015134eb541b1a1c6587b71cd3fc8 | GenericResource | Fort Bend Hand Practice | iRODs resource missing |
e46ea1e2c3f24d5ba2f64ce356e241ce | ModelProgramResource | My new collection | iRODs resource missing |
06a765609dc74a5090290ef34682f4ba | ModelInstanceResource | ADCIRC - serial - testcase | iRODs resource missing |
d0cbc743fc8e4a16bce2a6377a182e41 | CompositeResource | HydroShare Overview: Managing and Sharing Research Data Using HydroShare | iRODs resource missing |
67607a3752514947b7eaa92a0ce6ef5f | GenericResource | Onion Creek | iRODs resource missing |
9990bc18925b429aae35c142bea235da | NetcdfResource | UEB model simulation of snow water equivalent in Logan River watershed from 2008 to 2009 | iRODs resource missing |
438d578db3e2426cb1a14a939d0b36f0 | GenericResource | Hello From JupyterHub | iRODs resource missing |
946d1e62ed4c457db15f679e6bedc258 | GenericResource | Hello From JupyterHub | iRODs resource missing |
a2d59cd6d696401e90c159bc965a3ca9 | CollectionResource | Presentations about HydroShare | iRODs resource missing |
26d30f89e31f481bb63a4e089dfa1340 | CompositeResource | HydroShare and Model Sharing: Presentation to IWRSS Model Registry Team, Nov 8, 2016 | iRODs resource missing |
23c05d3177654a9ab9dc9023d00d16ed | CompositeResource | Supporting files for python tool subset_nwm_netcdf 1.1.4 | iRODs resource missing |
2773de0c379d4df4bed0b301b4525382 | GenericResource | Sentinel-2 Spectral Response | iRODs resource missing |
271d64a09da9460c919603b7bd5e9b29 | CompositeResource | A Subset of NWM Ver1.1 20170419 results for TwoMileCreek watershed at Tuscaloosa, Alabama | iRODs resource missing |
fa3c6b47370e4367b5c71d36def1d4f4 | GenericResource | IDEAS for GI | iRODS resource missing |
7feec694d0b140b5991ce20135c1dcef | GenericResource | DeadRun Discharge Observation Data | iRODs resource missing |
2e295531907844b985d5c1b95bf65420 | GenericResource | BoxElderCounty | iRODs resource missing |
b43e212dc4af45f0958bee1e94f6949e | GenericResource | DeadRun RHESSys model results | iRODs resource missing |
Scroll to right in table to see what's wrong. The whole table didn't fit. Please suggest actions to take.
Key to the above list:
notation | meaning |
---|---|
iRODs resource missing | whole resource tree (starting with short_id) is not present in iRODs |
files not in iRODs | There are files present in Django ResourceFiles that are not in iRODS |
files not in Django | There are files present in iRODs that are missing from Django |
files missing from Django and iRODS | There are files in Django that are not in iRODS, and other files that are in iRODs but not in Django |
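The key can also be read as a small decision rule. The sketch below is illustrative only: the function name `classify` and its inputs are assumptions, not the actual logic of check_irods_files. It takes whether the resource tree exists in iRODS plus the two one-sided file sets, and returns the notation used in the table.

```python
def classify(tree_exists, only_in_django, only_in_irods):
    """Map a resource's state to the notation used in the key above.

    tree_exists:    whether the resource tree (starting with short_id) is in iRODS
    only_in_django: files recorded in Django ResourceFiles but absent from iRODS
    only_in_irods:  files present in iRODS but absent from Django
    Returns None when Django and iRODS agree.
    """
    if not tree_exists:
        return "iRODs resource missing"
    if only_in_django and only_in_irods:
        return "files missing from Django and iRODS"
    if only_in_django:
        return "files not in iRODs"
    if only_in_irods:
        return "files not in Django"
    return None
```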
@dtarb @hyi @pkdash @mjstealey @horsburgh
Jeff, please weigh in on the above list. Which of the above resources are yours that were being manipulated by the REST API?
After sleeping on this issue, I have the following recommendations:
condition | likely cause | recommendations |
---|---|---|
File not present in Django | failed ResourceFile.delete in REST API leaves files behind -- fixed in 1.10.0 | delete file in iRODS using cleanup script. |
File not present in iRODS | reason unknown; best (unverified) guess is that the uploaded file was too large | research individually, make it possible to delete these in the REST API. |
Resource tree not present | failed delete_resource is best theory | delete resource in Django |
Comments:
During the `ResourceFile` fix, I discovered that `ResourceFile.delete` (as used in the REST API) was leaving iRODS files behind. This is likely the cause of "files not in Django". The recommendation is to automagically clean these up without notifying anyone. The user would have tried to delete the file in the REST API, and the file would still be there. This is the only way this could have happened, AFAIK.
For a missing resource tree, the best theory is a failed `delete_resource`. This routine deletes the iRODS stuff first, and then deletes from Django. My working theory is that the code crashed after deleting from iRODS. Cleanup is to delete the resource from Django, which may require some adjustment because the resource files are already deleted.
This course of action requires three hotfixes (because they have to run on www):
`ResourceFile` from Django that is already not present in iRODS. Same annotation as above.
Your thoughts?
There is one case in which a pair of errors ('missing in iRODs' and 'missing in Django') is rather obviously the result of a botched `move_or_rename`. I think I fixed this in 1.10.0.
In the other cases where there is an object in Django that is not in iRODs, provenance is not so clear. Translation: I might not have fixed whatever did this in 1.10.0. Best theory so far is that there was an attempt to upload something too large.
There was a general problem on www that -- since iRODs paths to files were manipulated in the application rather than the ResourceFile API -- the paths tended to get messed up by application programmers, which means that when one wants to delete the ResourceFile in iRODS, the path is incorrect and the delete fails in iRODs. I moved path handling to the ResourceFile API and this problem is gone in 1.10.0. This probably accounts for all "missing in Django" errors.
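To illustrate the idea of keeping path handling in one place, here is a minimal sketch. The class `StoredFile` and the helper `parse_storage_path` are hypothetical names, not the real ResourceFile API; the `<short_id>/data/contents/<name>` layout is taken from the log lines earlier in this thread, and federated path prefixes are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredFile:
    """Hypothetical stand-in for a file record that owns its own path logic."""

    short_id: str
    relative_path: str  # path below data/contents, e.g. "sub/file.csv"

    @property
    def storage_path(self):
        # The only place where the storage path is assembled.
        return f"{self.short_id}/data/contents/{self.relative_path}"


def parse_storage_path(path):
    """Split a storage path back into its parts, rejecting malformed paths."""
    short_id, sep, rest = path.partition("/data/contents/")
    if not sep or not short_id or not rest:
        raise ValueError(f"not a conformant resource path: {path}")
    return StoredFile(short_id, rest)
```

Callers that need a path ask the record for `storage_path` instead of concatenating strings themselves, so a delete always targets the same path that the upload used.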
@alvacouch , the resources that our group (with @horsburgh ) created/modified with the API are all those with the naming convention "iUTAH GAMUT Network ..."
@alvacouch
> During the ResourceFile fix, I discovered that the ResourceFile.delete (as used in the REST API) was leaving iRODS files behind. This is likely the cause of "files not in Django".

The internal API (`hydroshare.delete_resource_file()`) is used both in the REST API and in the view function used by the UI. So I am not sure why the files would fail to be deleted from iRODS only in the REST API case.
From the iUTAH workflow description that I got from Kenny, it seems that his script tries to delete a file only if that file is reported to be in Django since the REST api for listing files for a resource generates the list from Django.
@pkdash The REST API is simply where I saw the error during testing. It is not necessarily the only expression of that error. But I saw it there and the tests for the REST endpoints are the ones that validate correct function.
I did not go back and put extra tests into `hydroshare.delete_resource_file` (which, if we were adhering to best practices for naming, would be named `ResourceFile.delete` instead of what is there now, which is a policy-free delete that bypasses those rules).
@AmberSJones @horsburgh May we have permission to clean up your resources by deleting files that are only present on one side (in Django or in iRODs but not both)? See detailed log for the identities of these files (in link before long listing).
@AmberSJones @Horsburgh Actually, the errors in your resources are a bit of a mystery to me. Your iRODs files are missing. I don't know how that could happen. Would it be possible to try and upload and see if you can reproduce such errors? I wonder why they're happening.
@alvacouch - yes, go ahead and clean any of the ones that start with "iUTAH GAMUT". These are all being automatically generated by Kenny's script, so even if we had to regenerate them, we could. There's a couple of other iUTAH-related ones in your list, but we may want to look at those more closely.
@alvacouch - and yes, Kenny is working right now on the code that generates/updates the files in these resources. I'm hoping he figures out the issue he's got there soon so he can turn it back on and continue updating the files.
@ChristinaB Could you list the bags that won't download here? Do any of these correspond to the resources above (that are very, very broken)?
@ChristinaB I actually think that the things you're seeing might be different. I will start a new issue.
Starter list of resources (Tony will work on code) where clicking 'Download All Content as Zipped Bagit Archive' shows the message "Please wait for the resource bag to be created....." but never progresses or completes.
I generally agree with the approaches suggested above to resolve problems.
For files with a record entry in Django, but not in iRODS, there is no option for recovery, so we should just delete the file record entry in Django. In the tests I have done I have not found an error on the UI (not to say one is not there; I just did not find it). I have not found resources apart from the ones from Christina above where I am unable to download a file that appears on the UI.
For files in iRODS but with no corresponding resource in Django, we should archive them somewhere and then delete them from the system. Working from the archive, we should see if resourcemetadata.xml can tell us who the owners/creators were and the nature of the files. If I can get a list, I can make a judgment call as to whether we need to try to contact the user. It is likely that these are deletes that failed, so we will not have to do anything.
For files in iRODS with a resource in Django, but where the Django and iRODS file listings are different, set the Django listings to be consistent with iRODS and regenerate bag and metadata files, or set the flag to false so these are regenerated on demand. I think that resources with “Files not in Django” and “files missing from Django and iRODs” are in this category, as checking a few of these on the UI, files do appear on the UI and are downloadable (at least for the few I checked).
I was able to go to resources listed as “iRODs resource missing” and download files and the bag. So I am not sure what this error indicates.
@dtarb @pkdash @hyi @mjstealey I am closing this issue with some final comments.
Most of the problems above were synchronization problems, solved by carefully synchronizing dumps of Django and iRODs. The remaining (real) problems were fixed on a one-off basis in PR #2090. There remains some concern that the UI is corrupting Christina's resources due to the complexity of what she asks the UI to do. See issue #2095 and PR #2100 for a beginning to debugging that.
It is not theoretically possible to synchronize the beta environment with production on federated resources. The reason for this is that federated resources cannot be "copied" to the test environment; they're too large. Thus the production and beta environments continue to modify the same federated resources asynchronously and they will get out of sync when that happens. Thus, tests of whether federated resources are synchronized on beta are not feasible. The mechanism in PR #2100 can be used to test whether they are corrupted in production.
@Castronova @ChristinaB See issue #2056 for remaining issue concerning bag download (which doesn't seem to have much to do with this issue).
A quick check of the 1.10.0 beta.hydroshare.org shows that iRODS and Django disagree on the names of some files. This may be snapshot skew, but we don't have the tools to check that.
One temporary solution is to write a file validator that checks whether iRODS and Django are synchronized. This can be utilized to track down the causes of any mis-synchronization.
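The core of such a validator is a two-way set difference per resource. The sketch below assumes the two listings have already been fetched as plain path strings; the function name `compare_listings` is hypothetical, and the message wording is only loosely modeled on the check_irods_files output quoted in this thread.

```python
def compare_listings(short_id, django_paths, irods_paths):
    """Return one message per file that exists on only one side."""
    django_set, irods_set = set(django_paths), set(irods_paths)
    messages = []
    # Files Django knows about that iRODS does not hold.
    for path in sorted(django_set - irods_set):
        messages.append(f"{short_id}: django file {path} does not exist in iRODS")
    # Files iRODS holds that Django has no record of.
    for path in sorted(irods_set - django_set):
        messages.append(f"{short_id}: file {path} in iRODS does not exist in Django")
    return messages
```

An empty result means the two listings agree for that resource. It does not rule out snapshot skew between the two dumps.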