TexasDigitalLibrary / dataverse-reports

Statistical reports for Dataverse (https://dataverse.org/) that can support a federated instance.
MIT License

Need some help understanding recursion used here #12

Closed eunices closed 3 years ago

eunices commented 3 years ago

Hi! Not sure if this is the right avenue to ask, but can I check why recursion was used in the code?

e.g., load_users_recursive and load_user_dataverse

I'm running the Python program against the API of a local Dataverse instance. I have 3 users registered, but the report lists only 1 user, and I'm having some trouble understanding why this might be so.

In run.py, I'm only running the users.csv report:

            user_report = user_reports.report_users_recursive(dataverse_identifier='root')
            user_report_file = output.save_report_csv_file(output_file_path=work_dir + 'users.csv', headers=user_fieldnames, data=user_report)
            csv_reports.append(user_report_file)

But the log shows Dataverse-related output as well:

Loading configuration from file: %s config/application.yml
2021-09-16 16:27:35,412 - INFO: Attempting to connect to Dataverse database: localhost (host), dvndb (database), dvnapp (username) ******** (password).
2021-09-16 16:27:35,503 - INFO: Retrieving all Dataverse users...
2021-09-16 16:27:35,528 - INFO: Loaded 3 users.
2021-09-16 16:27:35,528 - INFO: Retrieving all Dataverse users...
2021-09-16 16:27:35,545 - INFO: Loaded 3 users.
2021-09-16 16:27:35,545 - INFO: Started creating reports...
2021-09-16 16:27:35,545 - INFO: Generating reports from the root dataverse
2021-09-16 16:27:35,545 - INFO: Begin loading users for root.
2021-09-16 16:27:35,545 - INFO: Loading dataverse: root.
2021-09-16 16:27:35,545 - INFO: Adding contact of dataverse to report: root
2021-09-16 16:27:35,568 - INFO: Dataverse name: Root
2021-09-16 16:27:35,568 - WARNING: Unable to find user from dataverseContact email: root@mailinator.com
2021-09-16 16:27:35,588 - INFO: Total dvObjects in this dataverse: 11
2021-09-16 16:27:35,589 - INFO: Found new dataverse 2.
2021-09-16 16:27:35,589 - INFO: Loading dataverse: 2.
2021-09-16 16:27:35,589 - INFO: Adding contact of dataverse to report: 2
2021-09-16 16:27:35,601 - INFO: Dataverse name: DataverseNO
2021-09-16 16:27:35,602 - WARNING: Unable to find user from dataverseContact email: dataverse-sample-data@mailinator.com
2021-09-16 16:27:35,615 - INFO: Total dvObjects in this dataverse: 0
2021-09-16 16:27:35,615 - INFO: Found new dataverse 3.
2021-09-16 16:27:35,615 - INFO: Loading dataverse: 3.
2021-09-16 16:27:35,615 - INFO: Adding contact of dataverse to report: 3
2021-09-16 16:27:35,627 - INFO: Dataverse name: Open Source at Harvard
2021-09-16 16:27:35,627 - WARNING: Unable to find user from dataverseContact email: philip_durbin@harvard.edu
2021-09-16 16:27:35,638 - INFO: Total dvObjects in this dataverse: 2
2021-09-16 16:27:35,639 - INFO: Found new dataverse 4.
2021-09-16 16:27:35,639 - INFO: Loading dataverse: 4.
2021-09-16 16:27:35,639 - INFO: Adding contact of dataverse to report: 4
2021-09-16 16:27:35,649 - INFO: Dataverse name: The Dataverse Project
2021-09-16 16:27:35,649 - WARNING: Unable to find user from dataverseContact email: philip_durbin@harvard.edu
2021-09-16 16:27:35,662 - INFO: Total dvObjects in this dataverse: 1
2021-09-16 16:27:35,663 - INFO: Found new dataverse 6.
2021-09-16 16:27:35,663 - INFO: Loading dataverse: 6.
2021-09-16 16:27:35,663 - INFO: Adding contact of dataverse to report: 6
2021-09-16 16:27:35,673 - INFO: Dataverse name: Eleni Castro Dataverse
2021-09-16 16:27:35,674 - WARNING: Unable to find user from dataverseContact email: support@dataverse.org
2021-09-16 16:27:35,684 - INFO: Total dvObjects in this dataverse: 1
2021-09-16 16:27:35,684 - INFO: Found new dataverse 19.
2021-09-16 16:27:35,684 - INFO: Loading dataverse: 19.
2021-09-16 16:27:35,684 - INFO: Adding contact of dataverse to report: 19
2021-09-16 16:27:35,694 - INFO: Dataverse name: Manchester Dataverse
2021-09-16 16:27:35,694 - WARNING: Unable to find user from dataverseContact email: support@dataverse.org
2021-09-16 16:27:35,709 - INFO: Total dvObjects in this dataverse: 1
2021-09-16 16:27:35,710 - INFO: Found new dataverse 34.
2021-09-16 16:27:35,710 - INFO: Loading dataverse: 34.
2021-09-16 16:27:35,710 - INFO: Adding contact of dataverse to report: 34
2021-09-16 16:27:35,718 - INFO: Dataverse name: HCPDS Dataverse
2021-09-16 16:27:35,719 - WARNING: Unable to find user from dataverseContact email: support@dataverse.org
2021-09-16 16:27:35,732 - INFO: Total dvObjects in this dataverse: 1
2021-09-16 16:27:35,732 - INFO: Found new dataverse 39.
2021-09-16 16:27:35,733 - INFO: Loading dataverse: 39.
2021-09-16 16:27:35,733 - INFO: Adding contact of dataverse to report: 39
2021-09-16 16:27:35,745 - INFO: Dataverse name: CMS Dataverse
2021-09-16 16:27:35,745 - WARNING: Unable to find user from dataverseContact email: support@dataverse.org
2021-09-16 16:27:35,757 - INFO: Total dvObjects in this dataverse: 1
2021-09-16 16:27:35,757 - INFO: Found new dataverse 53.
2021-09-16 16:27:35,757 - INFO: Loading dataverse: 53.
2021-09-16 16:27:35,757 - INFO: Adding contact of dataverse to report: 53
2021-09-16 16:27:35,766 - INFO: Dataverse name: ScholCommLab's Dataverse
2021-09-16 16:27:35,766 - WARNING: Unable to find user from dataverseContact email: scholcommlab@mailinator.com
2021-09-16 16:27:35,780 - INFO: Total dvObjects in this dataverse: 1
2021-09-16 16:27:35,780 - INFO: Found new dataverse 57.
2021-09-16 16:27:35,780 - INFO: Loading dataverse: 57.
2021-09-16 16:27:35,780 - INFO: Adding contact of dataverse to report: 57
2021-09-16 16:27:35,792 - INFO: Dataverse name: Ubiquity Press Dataverse
2021-09-16 16:27:35,792 - WARNING: Unable to find user from dataverseContact email: ubiquity-press@mailinator.com
2021-09-16 16:27:35,802 - INFO: Total dvObjects in this dataverse: 1
2021-09-16 16:27:35,802 - INFO: Found new dataverse 58.
2021-09-16 16:27:35,802 - INFO: Loading dataverse: 58.
2021-09-16 16:27:35,802 - INFO: Adding contact of dataverse to report: 58
2021-09-16 16:27:35,814 - INFO: Dataverse name: Journal of Open Psychology Data (JOPD) Dataverse
2021-09-16 16:27:35,815 - WARNING: Unable to find user from dataverseContact email: jopd@mailinator.com
2021-09-16 16:27:35,829 - INFO: Total dvObjects in this dataverse: 2
2021-09-16 16:27:35,829 - INFO: Found new dataverse 70.
2021-09-16 16:27:35,829 - INFO: Loading dataverse: 70.
2021-09-16 16:27:35,829 - INFO: Adding contact of dataverse to report: 70
2021-09-16 16:27:35,840 - INFO: Dataverse name: Gary King Dataverse
2021-09-16 16:27:35,841 - WARNING: Unable to find user from dataverseContact email: king@mailinator.com
2021-09-16 16:27:35,850 - INFO: Total dvObjects in this dataverse: 1
2021-09-16 16:27:35,850 - INFO: Found new dataverse 74.
2021-09-16 16:27:35,850 - INFO: Loading dataverse: 74.
2021-09-16 16:27:35,850 - INFO: Adding contact of dataverse to report: 74
2021-09-16 16:27:35,863 - INFO: Dataverse name: Dataverse Admin Dataverse
2021-09-16 16:27:35,875 - INFO: Total dvObjects in this dataverse: 2
2021-09-16 16:27:35,875 - INFO: Found new dataverse 76.
2021-09-16 16:27:35,875 - INFO: Loading dataverse: 76.
2021-09-16 16:27:35,875 - INFO: Adding contact of dataverse to report: 76
2021-09-16 16:27:35,884 - INFO: Dataverse name: Dataverse Admin Dataverse
2021-09-16 16:27:35,939 - INFO: Total dvObjects in this dataverse: 0
2021-09-16 16:27:35,940 - INFO: Found new dataverse 77.
2021-09-16 16:27:35,940 - INFO: Loading dataverse: 77.
2021-09-16 16:27:35,940 - INFO: Adding contact of dataverse to report: 77
2021-09-16 16:27:35,950 - INFO: Dataverse name: Anyone
2021-09-16 16:27:35,964 - INFO: Total dvObjects in this dataverse: 1
2021-09-16 16:27:35,964 - INFO: Finished loading 3 users for root
2021-09-16 16:27:35,964 - INFO: Saved report to CSV file /tmp/users.csv.
2021-09-16 16:27:35,964 - INFO: Creating Excel file: /Users/rdm/reports/dataverse-reports.xlsx
2021-09-16 16:27:35,975 - INFO: Saved report to Excel file /Users/rdm/reports/dataverse-reports.xlsx.
2021-09-16 16:27:35,975 - INFO: Finished saving Excel file to /Users/rdm/reports/dataverse-reports.xlsx.
2021-09-16 16:27:35,976 - INFO: Finished processing reports.

users.csv:

"id","userIdentifier","firstName","lastName","email","affiliation","position","isSuperuser","roles","createdTime","lastLoginTime"
1,"dataverseAdmin","Dataverse","Admin","dataverse@mailinator.com","Dataverse.org","Admin",True,"Admin, Contributor","2021-09-13 09:52:22.264","2021-09-16 15:21:57.998"

GET http://localhost:8080/api/admin/list-users/?key=<key>&unblock-key=<key2>&selectedPage=1

{"status":"OK","data":{"userCount":3,"selectedPage":1,"pagination":{"isNecessary":false,"numResults":3,"numResultsString":"3","docsPerPage":25,"selectedPageNumber":1,"pageCount":1,"hasPreviousPageNumber":false,"previousPageNumber":1,"hasNextPageNumber":false,"nextPageNumber":1,"startResultNumber":1,"endResultNumber":3,"startResultNumberString":"1","endResultNumberString":"3","remainingResults":0,"numberNextResults":0,"pageNumberList":[1]},"bundleStrings":{"userId":"ID","userIdentifier":"Username","lastName":"Last Name","firstName":"First Name","email":"Email","affiliation":"Affiliation","position":"Position","isSuperuser":"Superuser","authenticationProvider":"Authentication","roles":"Roles","createdTime":"Created Time","lastLoginTime":"Last Login Time","lastApiUseTime":"Last API Use Time"},"users":[{"id":1,"userIdentifier":"dataverseAdmin","lastName":"Admin","firstName":"Dataverse","email":"dataverse@mailinator.com","affiliation":"Dataverse.org","position":"Admin","isSuperuser":true,"authenticationProvider":"BuiltinAuthenticationProvider","roles":"Admin, Contributor","createdTime":"2021-09-13 09:52:22.264","lastLoginTime":"2021-09-16 15:21:57.998","lastApiUseTime":"2021-09-16 17:07:29.641","deactivated":false},{"id":3,"userIdentifier":"xxx","lastName":"xxx","firstName":"xxx","email":"xxx@xxx.com","isSuperuser":false,"authenticationProvider":"BuiltinAuthenticationProvider","roles":"Curator","createdTime":"2021-09-16 10:29:06.533","lastLoginTime":"2021-09-16 14:13:13.224","deactivated":false},{"id":2,"userIdentifier":"xxx","lastName":"xxx","firstName":"xxx","email":"xxx@xxx.com","isSuperuser":false,"authenticationProvider":"BuiltinAuthenticationProvider","roles":"","createdTime":"2021-09-13 10:51:57.686","lastLoginTime":"2021-09-16 11:36:40.603","lastApiUseTime":"2021-09-13 11:09:55.229","deactivated":false}]}}

I hope to understand this better.

Thanks

eunices commented 3 years ago

@nwoodward Hi! Even though I've closed this issue, any comments about the post above would be helpful. Thank you!

nwoodward commented 3 years ago

@eunices Hey! The code that creates the list of users traverses the dataverse/dataset tree because of how we have set up our Dataverse instance. In our case, each top-level dataverse belongs to a different institution. So, to keep the user lists separate, the code iterates recursively down the dataverse tree, using the metadata from each dataset to create a user list for each institution. One ancillary benefit of this approach, even for Dataverse instances that aren't federated like ours, is that the lists consist of active users.
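
As a rough illustration of the traversal described above (this is not the actual dataverse-reports code; the tree layout, the field names, and `collect_active_users` are all made up for the example), a recursive walk that builds one user list per top-level dataverse might look something like this:

```python
from typing import Dict, Optional, Set

# Hypothetical in-memory stand-in for a dataverse tree. The real tool reads
# this information from a Dataverse instance; this layout is an assumption.
TREE: Dict[str, dict] = {
    "root": {"children": ["inst-a", "inst-b"], "datasets": []},
    "inst-a": {"children": ["inst-a-lab"], "datasets": [{"creator": "alice"}]},
    "inst-a-lab": {
        "children": [],
        "datasets": [{"creator": "bob"}, {"creator": "alice"}],
    },
    "inst-b": {"children": [], "datasets": [{"creator": "dave"}]},
}

# Every registered account, including one ("carol") who never created anything.
REGISTERED = {"alice", "bob", "carol", "dave"}


def collect_active_users(
    tree: Dict[str, dict], identifier: str, seen: Optional[Set[str]] = None
) -> Set[str]:
    """Collect the users found in dataset metadata under `identifier`."""
    if seen is None:
        seen = set()
    node = tree[identifier]
    # Users are discovered from the datasets in this dataverse...
    for dataset in node["datasets"]:
        seen.add(dataset["creator"])
    # ...and then from every child dataverse, recursively.
    for child in node["children"]:
        collect_active_users(tree, child, seen)
    return seen


# One list per top-level dataverse (one per institution in a federated setup).
per_institution = {
    name: collect_active_users(TREE, name) for name in TREE["root"]["children"]
}
```

In this toy data, `per_institution` keeps the two institutions separate, and a registered user who never created a dataset (`carol`) never appears in any list, which is the "active users" effect mentioned above.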

The list-users API endpoint that you referenced (api/admin/list-users/) is also useful for creating a list of users, regardless of whether they have ever created a dataverse/dataset. The main issue we had with it is that the affiliation field is plain text and not tied to another database table. So people enter different values for the same institution, and we had a hard time grouping them together in any meaningful way.
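
To make the affiliation problem concrete, here is a small sketch with invented affiliation strings (none of these values come from a real instance, and `group_exact` / `group_normalized` are hypothetical helpers). It shows how free-text values for the same institution end up in separate groups, and how simple normalization only partly helps:

```python
from collections import Counter
from typing import Iterable

# Made-up affiliation values, as different users might type them
# into a free-text field.
affiliations = [
    "University of Texas at Austin",
    "UT Austin",
    "university of texas at austin ",
    "Texas A&M University",
]


def group_exact(values: Iterable[str]) -> Counter:
    """Group on the raw strings, exactly as entered."""
    return Counter(values)


def group_normalized(values: Iterable[str]) -> Counter:
    """Group after lowercasing and collapsing whitespace."""
    return Counter(" ".join(v.lower().split()) for v in values)


exact = group_exact(affiliations)
normalized = group_normalized(affiliations)
```

Exact matching treats all four strings as different institutions. Normalizing case and whitespace merges two of them, but an abbreviation like "UT Austin" still lands in its own group, so grouping by affiliation stays unreliable without a controlled list of institutions.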

Does this make sense? Let me know if I'm missing the point of your questions or if I can explain things further.

eunices commented 3 years ago

@nwoodward Thank you for your explanation, it's very clear. :)