Closed saintsoup52 closed 8 years ago
Looking at the complaints file for UofP. The OPEIDs are only 7 characters long and are missing a leading "0" - when I make an ope6 from a opeid I right pad the id with 0s, I'll have to left pad them to add the missing 0 at the front. I am not sure yet if this is the cause, but it may explain it - and its a problem anyway.
Also - 10 of the closed and non-invalid complaints for UofP have no opeid.
Is that right?
I was using that column to maintain load-order independence so that the VA folks using the DS need not worry about the order in which they load CSVs. Using the crosswalk for this meaning I can no longer do that for complaints. I will have to move complaint logic around for this.
Sorry... this is a critical piece.
We have no way to update the OPE IDs in the complaint database so the longer we go the worse that list will get.
I just tested the updated data service... we are still getting different answers, but now I know why.
If the complaint does not have a facility code associated with it then it should be ignored.
Right now you are first looking for a facility code and if that is not present you are using the OPE code in the complaint file. While that is an ingenious solution, there is no way for us to keep those OPE IDs current. This is needs to be fixed in the complaint system, not the comparison tool.
I am going to point out to EDU that complaints should all be associated with a facility code because this is a larger problem overall (169 closed, valid complaints with no facility code).
Almost there... good work!!
Thanks Patrick - I'm nulling out the ope ids as I read them in so when a complaint is created in the complaints table, there will be no ope id when first saved.
Right before a new complaint is saved in the complaints table, I update the complaint categories (financial, quality, etc) for that complaint iff the status is closed and not invalid (method :ok_to_sum?). If the regex matches one of your category keywords, the corresponding "match" attribute is set to 1 (otherwise it remains as 0).
After all the complaints are loaded in, I sum the complaints by facility code (but only for those complaints with facility codes) and update of the facility code sums for the associated complaints. Lastly, when the data_csv is built (after CSV files have been uploaded) I then get and save the proper OPE id from the crosswalk using the complaint facility_code. Since null facility_codes have no crosswalk, no OPE id is saved for those complaints and hence should not be included in the ope roll ups. I hope this works ....
I am now getting 476 (I used the UoP online) with 237 complains for that facility_code. Also, the distribution of complaints in different categories are different. Could you send me the complaints file you use (if it isn't Comparison Tool-Complaint Data 20160330.csv)
I just did a hand count of the UoP online for the complaints file I have in excel - there are 237 complaints for UoP Online which matches what my local DS says, but not what the GIBCT in production says (239). I don't think I have the same complaints file that you do. I am going to do a roll up manually to see what I get.
By hand I've counted 474 for UoP rolled up by ope6. I have 476 and you have 480?
Including no responses in "reason" its now 475 rolled up for UoP ope6
Got it - I didn't check off UoP Philly in the filter. The DS now matches the complaints counts by fac code and ope6 I did by hand using the complaint csv that I have. Are our complaints csvs different?
The latest complaint file is uploaded to the data service and for UoP the rolled up complaint total should be 493.
Remember the filters when you check the numbers against the raw complaint file:
When I filter all schools with the name "University of Phoenix" and apply the filters above I get 493 just like I have in the latest db output.
As an fyi... the name for UoP Online changed in WEAMS so you can't just search for the name, you have to search for the facility code.
We are now off by 3 with the UoP roll up... this is just because you are using the OPE ID in the complaint file when there is no facility code, which we shouldn't be doing.
Patrick - This is what I'm doing:
After all complaints have been ingested and saved I call a method to sum the complaints by facility code. This is a snippet of that - it's a query that sums all complaints for non-null facility codes and puts the sums in complaints_financial_by_fac_code, complaints_quality_by_fac_code, and so on. It produces complaint sums by facility code for every institution having a facility code.
results = Complaint.select(:facility_code).select(select_strings.join(", ")) .where.not(facility_code: nil).group(:facility_code)
I'm nulling out all opes, then setting the complaints in categories (financial, quality, ...) to 0s and 1s for those complaints that are not invalid and closed, then summing by facility code for non-null facility codes, assigning the ope from the crosswalk, and computing the ope sums for non-null opes.
Could you look this over and see if and where the logic flaws are?
Now this is getting interesting - our sums differ by campus (facility code) as well: the UoP-online has 237 via the DS and 239 in production. I can't explain this - null facility_codes would not enter into this at all.
@saintsoup52 @rickleegit Sorry to drone on like this, but I think the issue isn't null fac codes, but we are getting different sums for the different categories for each campus:
I updated the complaint file in the DS.. someone reverted it back to an April file. maybe me... I hope not.
So are we good? I've done these computations by hand, and the nil facility_codes don't seem to be entering into this (c.f., step 3 above). I'm just so confused at this point LOL.
This certainly looks like a versioning issue. The numbers are too close to be programatic calculation errors, practically speaking.
On Fri, May 27, 2016 at 12:06 PM, Marc Harbatkin notifications@github.com wrote:
So are we good? I've done these computations by hand, and the nil facility_codes don't seem to be entering into this (c.f., step 3 above). I'm just so confused at this point LOL.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/department-of-veterans-affairs/gibct-data-service/issues/77#issuecomment-222186881, or mute the thread https://github.com/notifications/unsubscribe/AKn3Dgt__xP7gm0gn0j4WofcoUPR7NQ8ks5qFxZpgaJpZM4IkhLW .
This was a versioning issue. PEBKAC.
The total rolled up complaints for University of Phoenix is different between my spreadsheet and the digital service.
My database has 480 rolled up complaint, the digital service has 477 complaints.
I thought it may have to do with approved / non-approved schools but that doesn't seem to be the issue.
I put the complaints side by side and I can't figure out why.