Open fyliu opened 2 years ago
I think we talked about this as a team but I don't remember if we made a decision on this or we created this ER to work on it later.
Looking at the screenshots and knowing what I do about how the website processes VRMS data, I can see that having certain url fields in the project table would make it easier to retrieve that data for the website. But maybe it's better db design to have them in another table that's just for storing urls and the projects they're associated with.
In django ORM, we can use fetch_related() to retrieve this data in the same query to avoid the n+1 problem.
We will be removing the urls from the Project table. If the url type is not yet in the url_type spreadsheet tab, then we will add it. We're keeping both url_type and project_url. The ERD and SS need to be updated, as well as creating a new issue for updating the project table.
[fang: added as task items in the issue description]
I have some questions about the information for a project's google drive and github.
Current Github related fields: github_org_id github_primary_repo_id github_primary_url
Current Google Drive relared fields: google_drive_url google_drive_id
Do we want to keep github_primary_repo_id and google_drive_id in the project table? What is github_org_id?
@Neecolaa If you run this in a linux terminal, it will return a lot of github info about the project repo. Link to github API docs section
curl -L \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
https://api.github.com/repos/hackforla/peopledepot
Full text of response hidden below:
I'm assuming these are the values we're intending to store. @ExperimentsInHonesty are these correct for peopledepot?
repo id (id): 429264142
org id (organization.id): 11635254
primary url (html_url): https://github.com/hackforla/peopledepot
I was able to confirm the github_primary_repo_id looking at Civic Tech Index's data in the website repo.
GitHub response hidden below
Projects like CTI has a parent org that's different from hackforla, so I guess it's useful to show that. But it also looks like the more important string to save is the full_name field (of the repo): "civictechindex/CTI-website-frontend". With it, an application can query github for the other information. I guess like Nicole said, we need to decide if there's a good reason to store the other pieces of data ourselves.
If there is, it seems like the user should be able to just enter the github_primary_url, and the frontend should be able to extract the full_name of the repo, make the API call, and populate the other pieces of data. This would serve as verification that the github_primary_url field is entered correctly.
We need the repo id and the url of the id.
However, a project could rename the repo and forget to update people depo with the new name. GitHub uses the old name as an alias unless the name get reused. Given that the repo names are in hackforla's github org, that is unlikely to be intentional.
We will need a script that runs occasionally (maybe weekly) that checks to see if the url of the repo match the id in github. if it doesnt match, it would create an issue with exception report.
Added discussion summary to issue description. We still need to discuss the google drive fields. Moving issue to Questions.
Current Github related fields:
project
project
for validating against the repo idCurrent Google Drive related fields:
project
?)project
?)I remember the settled solution being to remove the URLs and keep the IDs. The URLs would be in the project_url table
@Neecolaa Okay. Sounds like #208 is ready to review then. I will do that.
We looked back over the github and google drive id fields in project and propose the following structure changes:
to project_url:
add name
field: What the link is called Ex: Primary Repo, Secondary Repo, Description of "other" type urls
add external_id
field: resource id provided by github or google
to project:
update github_primary_repo_id
field: no longer the repo id itself, will now be a foreign key to a row in project_url that corresponds to the primary repo
update github_org_id
: no longer the org id itself, will now be a foreign key to a row in project_url that corresponds to the org repo link
update google_drive_id
: no longer the google drive id itself, will now be a foreign key to a row in project_url that corresponds to the google drive
@Neecolaa sounds like a good way to store the ids and also have quick references to these more essential project data. We will need an issue or 2 created for this:
@Neecolaa if you need someone else to make the issue, Add this to the next team agenda 2024-05-30 https://github.com/hackforla/peopledepot/issues/103#issuecomment-2128313897 or any agenda after that.
I think I created all the issues necessary to work on this ER. Should I close this issue or add the new issues as dependencies and move it to the ice box?
I created work issues relating to this and left the documentation step for this issue. I'm moving it to the ice box since this issue depends on the other ones.
Dependency
289
290
Overview
As system designers, we need to document our original intent so that users will be clear on how to use the project. We need to decide how project url fields and project_url entries relate, what should go where, and document it.
Action Items
external_id
field from this commentWe will be removing the url fields from the Project table. (From comment below)
Resources/Instructions
peopledepot
projectSummary of discussions
github_primary...
is because some projects have separate frontend and backend repo, for examplegithub_primary_repo_id
(id from the API response) andgithub_primary_repo_url
(html_url) in theproject
for validation purposes