Closed bella004 closed 7 years ago
A few observations;
.sql
files are to be in the directory /src/db
: see e-mail on this topicSET LOCAL SCHEMA
everywhere; instead qualify all function and table names with schema nameONLY RUN THE QUERY AFTER IMPORTING ROSTER AND OPENCLOSE CSVS
:
getAttendance(SectionID INTEGER)
sectionID
: start function/variable/parameter names with lower case(SELECT ID FROM curSec)
seems unnecessary: could be replaced with $1
sdar.i
is required in the GROUP BY
clausegetAttendance
. See comment here.getSectionID
SeasonName
is inconsistent with its use: call it just season
and add a comment explaining the possible domains the parameter referencesgetAttendance
This raises some questions:
.sql
files are to be in the directory /src/db
" There is no such directory on my branch. It's just /src
SET LOCAL SCHEMA
everywhere; instead qualify all function and table names with schema name" So I got #34 backwards?getAttendance
" So, change getSectionID
to getAttendance
and getAttendance
to something else?/src/db
directory in this branch in your local repo and move the file to that directory. Then commit and push changes. There is already a /src/db
directory in the master on the server.getAttendance
which returns attendance. (Postgres supports function overloading similar to C++ and Java.) However, it does not need to implement the entire functionality. Instead, it only needs to determine the section ID for the given parameters and then call the 1-arg version. The code to determine section ID is in function getSectionID
in this PR. It is OK if that function is renamed getAttendance
and then enhanced to return attendance.
getSectionID
and reuse it in the 4-arg version of getAttendance
. However, as of now, getSectionID
would be useful only in the 4-arg version of getAttendance
. So, it is better to just rename/revise getSectionID
to become the 4-arg version of getAttendance
.I just ran the file and it produced an error: column "st.lname" must appear in the GROUP BY clause or be used in an aggregate function
. This is because I'm using st.lname
, st.fname
, and st.mname
in the SELECT
, so would it be preferred if I remove those from the SELECT
, or put them back in the GROUP BY
and remove the sdar.i
from the GROUP BY
, because I know I'm not using that in the SELECT
.
It is possible I am not seeing something that is really obvious (thus my use of the phrase "I feel" originally). Let me know what I am missing.
Anyway, I wrote the following script to test what I understand the scenario is. (I had to create my own tables because I don't have a functioning Gradebook DB right now).
The SELECT
query here groups by S.ID
but outputs S.FName
and S.LName
. This is possible because the name columns are used in expression involving the aggregate function string_agg
.
CREATE TEMP TABLE IF NOT EXISTS S (ID INTEGER PRIMARY KEY, FName VARCHAR, LName VARCHAR);
CREATE TEMP TABLE IF NOT EXISTS A (SID INTEGER REFERENCES S, Status VARCHAR);
TRUNCATE S CASCADE;
INSERT INTO S VALUES(1, 'Joe', 'Baker');
INSERT INTO S VALUES(2, 'Mary', 'Anne');
INSERT INTO A VALUES(1, 'A');
INSERT INTO A VALUES(1, 'P');
INSERT INTO A VALUES(1, 'X');
INSERT INTO A VALUES(2, 'P');
INSERT INTO A VALUES(2, 'E');
INSERT INTO A VALUES(2, 'P');
SELECT S.LName || ',' || S.FName || ',' || string_agg(Status, ',') AS csv
FROM S JOIN A ON S.ID=A.SID
GROUP BY S.ID;
This script produces the following result:
csv |
---|
"Baker,Joe,A,P,X" |
"Anne,Mary,P,E,P" |
I am sure other reviewers will have thoughts as well on my comments and this PR in general.
It should now work. Apparently, it was okay to use only st.id
in the GROUP BY
, but not sdar.i
The changes made so far look good.
One small issue that was mentioned by @smurthys has not been addressed:
Add product name after file name in the prologue comment: see ClassDB source files for example
Solution: Add - Gradebook
on line 1 after the file name. This makes it consistent with the prologue of other files. As a side note, some files have capitalized Gradebook
as GradeBook
. However, we are using the first capitalization (lowercase 'b').
Apart from that, I noticed a couple of potential minor changes that should be considered:
Line 81 uses csv_data
and an alias for the first row (csv header). Line 84 uses csv
an alias for a csv row of a student and their attendance codes. It is not necessary to use an alias here, since the one given during the function declaration, dataStr
(line 57), will be used. However, as it currently stands, the first alias is misleading to someone reading the code. (Perhaps it should be csv_header
?)
The getAttendance
function declares and uses a variable named sectionID
. However, this may not be necessary. Consider avoiding the declaration and use of a variable by just using the value of N.ID
when calling createAttendancePivot
. Doing so will allow the function to use sql
instead of plpgsql
, which is usually desirable in terms of efficiency and portability.
On line 71 , sub queries (i.e. SELECT x FROM y
) are used to pick a value to call datesFromSchedule
with. Is it necessary to do so? For instance, does using curSec.sd
instead of SELECT sd FROM curSec
cause an error? Avoiding the use of sub queries here will make it easier to interpret this line. If we do this, then we will also have to add curSec
to the FROM clause.
Generally, we should try to be consistent in terms of situations where we use a capital letter vs a lowercase letter, even if they do not cause an issue. Doing so increases the ease of reading the code, and prevents issue when they do happen to become case sensitive. At the very least, we should be consistent with the existing naming convention in a particular file. A couple of situations where capitalization is not consistent in getAttendance.sql
are with table names and alias names. It seems like most of the time, the first letter of a table name is capitalized, and that single letter aliases are lowercase, however there seem to be a few cases where that is not being followed.
Similar to the previous remark, there are situations where spacing before and after an equals sign is not consistent.
Thanks for your help with this task.
(Updated to take Andrew's comment into consideration)
Hi Kyle,
I did some testing on getAttendance.sql
while updating the web server, and I have a few observations:
createAttendancePivot
should be called getAttendance
, since it is the one-argument overload of the other getAttendance
function.datesFromSchedule
should also be qualified with gradebook.
.getAttendance
should also return a one column table like the other getAttendance
.csvwheadattnrec
). We can discuss changing this if you want, but the server accesses this column by name, so it is important they match.RETURN QUERY
must be placed above a select query returning a value in a plpgsql function. There's some more info on this in the plpgsql docs if you're interested. However, this is not necessary if you change the function to use SQL, like Andrew mentioned.With these couple tweaks, these functions should be go to go as far as the server is concerned.
"On line 71 , sub queries (i.e. SELECT x FROM y
) are used to pick a value to call datesFromSchedule
with. Is it necessary to do so? For instance, does using curSec.sd
instead of SELECT sd FROM curSec
cause an error? Avoiding the use of sub queries here will make it easier to interpret this line. If we do this, then we will also have to add curSec
to the FROM clause."
I would do that if I didn't have to JOIN curSec
to the FROM clause, since I would:
curSec
on, which I don't think there is, orScratch that, I found a way to justify the second cross join if it's in a separate WITH clause.
This script is shaping up really well, thanks to @bella004's persistence and the reviewers' diligence.
In addition to the earlier review comments, I propose the following changes:
getAttendanceStub.sql
, because it is not a complete test. (We have yet to discuss organization of unit tests.) The contents of that file are perhaps better saved in a gist.csvwheadattnrec
to AttendanceCsvWithHeader
, because the current name is not in camel case and is not descriptive. (Unlike function/variable/parameter names, table and column names begin with an upper case letter.) This change will trigger an edit in the web server (which is yet to merged).On renaming the result column csvwheadattnrec
to AttendanceCsvWithHeader
: I don't know for sure, but it is possible this renaming conflicts with JSON naming (conventions) in the web server. However, almost any DB team is virtually guaranteed to disregard coding conventions of teams that use the DBMS. (They will of course reconsider names if they conflict with a protocol or other operational aspects.)
The getAttendance function is now more modular only needing sectionID The getSectionID function returns the sectionID that getAttendance needs
Note: Make sure all imports are made before running