Open joey-ma opened 1 year ago
At the meeting, Chelsey suggested recording this metadata in a data catalog, which is a tool intended to be used by data professionals (data engineers, data scientists, data stewards, and chief data officers). It doesn't directly have a benefit for the software engineering side of the project, but if it's not too much work, I think we should attempt to fill in this document in the spirit of striving to be a better project.
I was personally thinking to make note of things in maybe the create table issues, which helps guide the developer when they work on them, and then the developer should copy or write a more appropriate version of it as documentation that becomes part of the PR along with the code (Documentation as code). I don't know if it makes sense for the data catalog to be part of the documentation that's in the code repository, but it would make it easier for the developer to work on it since it's accessible in VS Code just like the documentation and the source code. The developer would just need to do an extra copy/paste to the data catalog file without leaving VS Code or whatever editor they use. I'm certain we will have search functionality in whatever documentation system we choose to use.
On the other hand, having a standalone document might be easier for a data person to consume.
We need more context for the table issues, to note the scenarios of how the table is being used.
Comment: I agree, and I assume this information is for the benefit of the developer working on the issue of implementing the model?
Some table names don't precisely reveal their intended use, like role.
Comment: It's good to point out these things and I think the solution is to rename them to be more specific whenever possible.
We need example data for the models.
Comment: I can see how this can be converted into unit tests. Having agreement on which fields are nullable/blankable helps. Currently, I assume most fields can be empty (i.e. optional). The reason is that a client using the API would be able to use it without including all the fields. This is actually not great design. One of the dangers is that we're providing a data field that no client actually uses or needs, and we wouldn't know that we could remove it. The better way (best practice) is to implement only the fields that some client asks for (will use). This is something we need to discuss and decide on as a team. My original assumption was based on the thought that "the VRMS UI won't be caught up enough to provide all the model's data when they integrate".
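To make the nullable/blankable discussion concrete, here is a minimal sketch of how an agreed field list could turn into a unit-testable check. The field names come from this thread (name_first, name_last, leadership_type), but the split into required and optional is an assumption for illustration only, not a team decision, and `validate_payload` is a hypothetical helper.

```python
# Hypothetical field spec. Which fields are required vs. optional is an
# assumption made for this sketch, not something the team has decided.
REQUIRED_FIELDS = {"name_first", "name_last"}
OPTIONAL_FIELDS = {"leadership_type"}


def validate_payload(payload: dict) -> list:
    """Return a list of problems found in a client payload (empty if none)."""
    errors = []
    # A required field must be present and neither null nor blank.
    for field in sorted(REQUIRED_FIELDS):
        if payload.get(field) in (None, ""):
            errors.append(f"{field}: required")
    # Fields that no client asked for are flagged instead of silently accepted.
    for field in sorted(set(payload) - REQUIRED_FIELDS - OPTIONAL_FIELDS):
        errors.append(f"{field}: unknown field")
    return errors


complete = validate_payload({"name_first": "Joey", "name_last": "Ma"})
missing = validate_payload({"name_first": "Joey"})
extra = validate_payload({"name_first": "Joey", "name_last": "Ma", "nickname": "J"})
```

Each example payload here doubles as example data for the model, so the same list could be reused in the table issue and in the tests.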
Example 1 user stories
Comment: I assume this benefits the developer. But it feels like something that's not readily consumable for the developer. Further developing them into use cases might be a better representation format.
For Scenario 1, I think a more useful representation would be this:
Scenario 2 might be something like this:
Scenario 3 could be like this:
Comment: So this can become 5 use cases with 2 actors (guest, logged in user). These can translate directly into acceptance criteria for use in tests.
Example 2
It's not mentioned in the user story, but I assume this is about the role table.
Types of questions:
Comment: I think it's great to ask questions and work out a reasonable example to draw out any unwritten requirements or come to a better/clearer design. I think all the questions should be made into task items for discussion. Then we can check them off when we're done with them. Note any decisions that were made or new issues stemming from them.
- We need more context for the table issues, to note the scenarios of how the table is being used.
Comment: I agree, and I assume this information is for the benefit of the developer working on the issue of implementing the model?
Definitely. A descriptive prompt with different examples helps developers better understand the problem, so they can focus more energy on coming up with a solution. In our case, we have a general understanding of our goal, but it'd be hard to be caught up on all previous discussions among a cross-functional team. I definitely understand it's an iterative process and we are getting closer and closer to our goal as we go, but writing down the context and examples for the table issues helps to document the requirements, even if the note itself is improved incrementally.
- Example 1 user stories Comment: ...
We can start somewhere and improve it incrementally as requirements become more and more clear. Certainly more examples / scenarios help developers come up with the solution that meets the expectation(s).
I think all the questions should be made into task items for discussion. Then we can check them off when we're done with them. Note any decisions that were made or new issues stemming from them.
💯. And having "user stories" and "tables" (that help illustrate scenarios or expected data) are meant to help convert clarifying questions into actionable steps. 🙏🏻
I'd think the team implementing the front end would write the user stories. Our direct users would be the teams using the API (or APIs if we do a Backend For Frontend), as we don't have our own front end outside of that.
When we have an idea of how something will work, we can write the user stories and ask for verification from the given group. So for the check-in example, VRMS would be the ones either creating the user stories, or verifying that the stories we wrote were correct.
I did find the user story table a little difficult to read. Are the examples listed for Scenario 1 examples of meeting titles/types?
I'd think we'd be the ones who'd write the queries based off what stakeholders want (like what's in #2).
Would writing views be helpful here? Really they're just stored queries. Implementing views in django
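As a rough illustration of "views are just stored queries", here is a minimal sketch using Python's built-in sqlite3 module rather than Django. The user table and its name_first / name_last columns are taken from this thread; the view name user_display and the full_name column are made up for the example.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one table and one view over it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE "user" (
            id INTEGER PRIMARY KEY,
            name_first TEXT NOT NULL,
            name_last TEXT NOT NULL
        );
        -- A view is a named, stored SELECT; it holds no data of its own.
        -- The name user_display is hypothetical.
        CREATE VIEW user_display AS
            SELECT id, name_first || ' ' || name_last AS full_name
            FROM "user";
        INSERT INTO "user" (id, name_first, name_last) VALUES (1, 'Joey', 'Ma');
        """
    )
    return conn


conn = build_demo_db()
# Callers query the view exactly like a table.
rows = conn.execute("SELECT id, full_name FROM user_display").fetchall()
print(rows)
```

In Django, one common way to read from a database view is an unmanaged model (`managed = False` in the model's `Meta`, with `db_table` set to the view name), so the ORM queries it without trying to create or migrate it.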
The data dictionary Chelsey started will be great for this. The fields tab in the PD: Table and field explanations has some information we can transfer to the data dictionary, but it's incomplete.
For tables where we have the data already, I have added link symbols to the right corner of the table's header. Clicking that will open up the data tab for that table in the PD spreadsheet. Not as convenient as the data dictionary will be, but still something for now.
Which is the example for the XREF table? I'm most familiar with xref tables being like this, but I don't think that's what you mean here.
If xref refers to some other common idea, I have a potential concern:
Currently, we use xref in our table names whenever we have a table implementing a many-to-many relationship (ex: project_language_xref stores the relationships between many projects and the many languages they use). Should we change this? Initially, we used "join" (ex: project_language_join), but changed it to avoid the JOIN keyword. This example doesn't use a keyword, but I like having one so we can find these types of tables more easily. Maybe "link"?
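For anyone less familiar with the convention, here is a minimal sketch of what a table like project_language_xref does, using Python's built-in sqlite3 module. The column names (project_id, language_id), the sample rows, and the `languages_for` helper are assumptions for illustration; they are not the actual PeopleDepot schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE language (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The xref table stores one row per (project, language) pair.
    -- Column names here are assumed for the sketch.
    CREATE TABLE project_language_xref (
        project_id INTEGER NOT NULL REFERENCES project(id),
        language_id INTEGER NOT NULL REFERENCES language(id),
        PRIMARY KEY (project_id, language_id)
    );
    INSERT INTO project (id, name) VALUES (1, 'Project A'), (2, 'Project B');
    INSERT INTO language (id, name) VALUES (1, 'Python'), (2, 'JavaScript');
    INSERT INTO project_language_xref (project_id, language_id)
        VALUES (1, 1), (2, 1), (2, 2);
    """
)


def languages_for(project_name: str) -> list:
    """Return the names of the languages linked to a project, sorted by name."""
    rows = conn.execute(
        """
        SELECT language.name
        FROM project
        JOIN project_language_xref AS x ON x.project_id = project.id
        JOIN language ON language.id = x.language_id
        WHERE project.name = ?
        ORDER BY language.name
        """,
        (project_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(languages_for("Project B"))
```

Whatever suffix we settle on ("xref", "link", or something else), the table shape stays the same; only the name changes.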
I started to write something more and it turned out really long. I'll shorten it here by leaving out the details.
The API abstraction is there for a reason. The backend mentality should be to work towards the API.
It's a mistake to work on user stories backwards from tables in the backend. User stories are on the opposite side of the API from the backend.
The frontend needs to develop and keep all the user stories and work with the backend to come up with an API that satisfies the data usage requirements.
Just having the table definitions like we have isn't an API. We need to establish actual endpoints and a set of behaviors as acceptance criteria.
This makes me want to recommend a break from our current workflow and work first on specifying the API with the frontend team, which is VRMS right now.
User stories are not appropriate to include in table issues for other reasons.
@yoyoyojoe let's chat briefly about this tomorrow....
Summary is we have ways of doing what you are asking for; the blocker here is my and Nicole's time.
How to manage this issue
order of people
Details
Hi y'all!
With Bonnie’s help we are quickly prioritizing issues and we’re getting more clarity for a lot of issues, but I wonder if a few more notes (especially as they relate to understanding our requirements) could help with everyone being on the same page.
Suggestion: Not hoping to make the process more laborious, I personally am noticing a pattern and would like to just propose an idea (or 2) to help make things more clear:
Reasoning:
ERD describes a complex initiative with several user stories and subtasks, and the PD: Table and field explanations is providing more details in a spreadsheet format. However, it takes quite a bit of effort (for me, at least) to remember how a table is being used in what scenarios, and oftentimes the spreadsheet also doesn’t have all the details documented. This might also be due to multiple things happening asynchronously and slowly, yet the context for each issue is often different, but it’s not being written in the comments.
Suggestion:
An example but using unrelated data:
We have ERD (what's on the left), but we don't have the expected data fields (bottom right, which is something similar to the table that I am thinking about, it also doesn't have to be perfect), and eventually, based on what we expect to get, the backend team can help with the actual query (upper right). Even if the frontend is to come up with the query string, knowing the expected fields and data with examples can still be helpful.
Example 1: (a simplified example, but maybe not the best example)
Example 2 (for current issue):
Job_Field or Job_Department or Functional_Area or "Competencies"?
Is the role table referring to "Software Engineer" (title) or "Front-end Development" (position / area of responsibility), or "Tech Lead" (leadership)? Note leadership_type and permission_type and role are each a table.
Here is my current understanding:
I'm imagining: "Fang" who has the role of a "Software Engineer", with the leadership_type of "Tech Lead" for project "PeopleDepot" with permission_type "PeopleDepot Admin" will go to "somewhere_page" to enter user.name_first and user.name_last "Joey Ma" (which will query from the list of all users, or all users within project "PeopleDepot"), and upon selecting this user (by user.id), Fang can then change my leadership_type to "Interim Tech Lead" and permission_type "PeopleDepot Admin"...
However, this is more of a guess than an actual understanding of the requirements. Having an actual example would be nice. At this time, I'm actually not quite sure how an "admin user" will be able to assign permission to another user. Someone else could have a different version of what the user is expected to see, select, and enter.
While this user story doesn't consider quite everything, having some written examples of user stories will help with remembering what the devs are developing for and thereby how to optimize code and db design. The dev team will eventually need some example data to come up with each query appropriately, and to clear up what the best data type to store the data would be.
some table that helps to illustrate the relationship
Originally posted by @yoyoyojoe in https://github.com/hackforla/peopledepot/issues/142#issuecomment-1483632903