Closed dflynn-volpe closed 1 year ago
It's almost certainly a rounding error. I'll add some internal cross-checking...
Yep, here's a possible solution (using the real inputs):
# Here are the first few lines of splitInt():
SplitAll_ <- Props_ * Tot
SplitInt_ <- round(SplitAll_)
Rem <- sum(SplitAll_ - SplitInt_)
# Rem = 1, meaning that we are off by 1 and need to sample from the proportions to assign one DU to a land use type
sample(1:length(Props_), abs(Rem), replace = TRUE, prob = Props_)
# This is the problem, it produces no result:
# integer(0)
If we add a rounding step to ensure that our remainder is an integer, this succeeds:
Rem = round(Rem, 0)
sample(1:length(Props_), abs(Rem), replace = TRUE, prob = Props_)
# [1] 2
I will push a fix to development after testing that this works.
Slightly different. The right way to think about the inputs to splitInt() is that the 2-D array DU_BzHt_full
is applied to each 2-D slice of the array Props_BzHtLt
(one slice for each land use type).
The problem is that in the particular inputs I'm using, the remainder Rem
for the Town slice is not an integer! It comes out to 2.5. The sample()
function requires the size parameter (Rem) to be an integer. This might mean re-configuring this step so that it applies the number of dwelling units by land use type separately for each Bzone; currently it is set up to go across Bzones, within each land use type.
AssignLocTypes_TestData.zip
Fixing in patch_splitInt
Closed by 47af0b4
Reopening this based on failure in H-GAC 5000 BZone model: when the probabilities are equal (0.5 each in Urban and Town) and a small number of units is presented (3), the result rounds up and yields a wrong total. The difference is small (2 units) but trashes the tapply that comes up next since the LocType_Hx vector has an inconsistent length. I'll close this again once I've fixed that (probably simplest to enforce matching the requested total and just manually increment/decrement non-zero results to get it to match up)
Fix is in the pipeline
Fixed in development-next push 2023-03-24
Here is the splitInt() function in the AssignLocTypes module:
This works by feeding a dataframe of proportions of land types 'Lt' (urban, town, rural) by housing type 'Ht' (single family, multifamily, group quarters) for each Bzone 'Bz'. This is stored in the 3-dimensinal array Props_BzHtLt. Those proportions are then applied to the vector describing the number of housing types for each Bzone, stored in the 2-dimensional array DU_BzHt_full, here.
For one particular set of inputs I'm working with, this is now failing to correctly apply the splitInt, resulting in a mismatch between the length of the vector of names and housing IDs. The error occurs in the double loop here in AssignLocTypes.R.
Problem: The resulting DU_BzHtLt for this Bzone and housing type only sums to 984, but there are 985 dwelling units in this Bzone for the SF housing type. First, showing how it is supposed to work (and does, when manually feeding in data):
So what's the issue?