Closed sbassett closed 2 years ago
@aj1s @tchapman100 I've hit a wall on this. It's in the nwland_dev_withWLIC branch.
Here are the CSVs produced for lc_ids and caland_df$Land_Cat_ID (converted to xlsx because GitHub won't accept CSVs in comments): Test_Land_Cat_ID.xlsx, Test_lc_ids.xlsx
Encountered this error again today with a slightly different code and different inputs.
in WLIC branch: GitHub\NWLAND\preproc\NWLAND_proc_iesm_climate_v2.r
Error thrown by scalar_out[v, dl, totyind, lc_inds] = lc_vals [line 747]:
Error in scalar_out[v, dl, totyind, lc_inds] <- lc_vals :
NAs are not allowed in subscripted assignments
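For context, a minimal sketch of how this error can arise when match() returns NA for an ID that has no counterpart. The vectors below are made-up stand-ins for lc_ids, caland_df$Land_Cat_ID, and lc_vals, not the real data, and the guard at the end is only one possible way to surface the unmatched IDs.

```r
# Hypothetical stand-ins for lc_ids, caland_df$Land_Cat_ID, and lc_vals.
ids_wanted <- c(101, 999, 103)   # 999 has no counterpart in the table
ids_table  <- c(101, 102, 103)
vals       <- c(1.5, 2.5, 3.5)

inds <- match(ids_wanted, ids_table)  # NA marks an ID that was not found

out <- numeric(length(ids_table))

# Assigning several values through an index that contains NA fails.
err <- tryCatch({
  out[inds] <- vals
  NA_character_
}, error = function(e) conditionMessage(e))
print(err)

# Guard: collect the unmatched IDs, then assign only the matched ones.
found     <- !is.na(inds)
unmatched <- ids_wanted[!found]
out[inds[found]] <- vals[found]
print(unmatched)
print(out)
```

Checking anyNA(inds) right after the match() call would flag the problem before it reaches the assignment.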
Both lc_ids and caland_df$Land_Cat_ID are double.
lc_ids is a "named number"; caland_df$Land_Cat_ID is just a plain number.
> str(caland_df$Land_Cat_ID)
num [1:6293] 8.08e+08 8.08e+08 8.08e+08 8.08e+08 8.08e+08 ...
> str(lc_ids)
Named num [1:59] 8.00e+08 8.01e+08 8.01e+08 8.02e+08 8.02e+08 ...
- attr(*, "names")= chr [1:59] "800304000" "800704000" "801304000" "801504000" ...
maybe try this: https://stackoverflow.com/questions/15736719/how-do-i-extract-just-the-number-from-a-named-number-without-the-name
Even after stripping the names with unname(), the match still doesn't work:
> namless_lc_ids <- unname(lc_ids)
> str(namless_lc_ids)
num [1:59] 8.00e+08 8.01e+08 8.01e+08 8.02e+08 8.02e+08 ...
> namless_lc_inds = match(namless_lc_ids, caland_df$Land_Cat_ID)
> str(namless_lc_inds)
int [1:59] NA NA NA NA NA NA NA NA NA NA ...
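That result is consistent with how match() behaves: it compares values only, so the names attribute should not affect the outcome. A small sketch with made-up IDs (not the real lc_ids):

```r
# Made-up IDs; the names mimic the "named number" structure of lc_ids.
table_ids <- c(800304000, 800704000, 801304000)
named_ids <- c("800704000" = 800704000, "801304000" = 801304000)

with_names    <- match(named_ids, table_ids)
without_names <- match(unname(named_ids), table_ids)

# Both calls return the same positions, so names are not the cause.
same <- identical(with_names, without_names)
print(with_names)
print(same)
```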
New hypothesis from @aj1s: the num represented by scientific notation (e.g. 3.03e+08) is throwing the match() off.
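One quick way to test this hypothesis: scientific notation in str() output is a display choice (str() shows only a few significant digits), so the stored value can be checked directly. The value below is illustrative.

```r
x <- 800304000   # written out in full
y <- 8.00304e8   # the same number in scientific notation

same_value <- identical(x, y)   # compares the stored doubles
pos        <- match(y, c(1, x)) # position of y in a small lookup vector

# Show the full digits instead of the abbreviated str() display.
full <- format(x, scientific = FALSE)
print(same_value)
print(pos)
print(full)
```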
Can try to convert the nums to ints using as.integer(X).
This would work, except that R integers are 32-bit signed, so the largest representable value is 2,147,483,647 (about 2 billion), and some of the IDs are larger than that.
> int_lc_ids <- as.integer(lc_ids)
Warning message:
NAs introduced by coercion to integer range
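A short sketch of why the coercion produces NAs, using two ID values that appear elsewhere in this thread. Doubles represent whole numbers exactly up to 2^53, so values of this size can stay as doubles without losing digits.

```r
int_max <- .Machine$integer.max   # largest R integer, 2147483647

ids  <- c(800304000, 3503904000)
fits <- ids <= int_max            # which IDs fit in a 32-bit integer

# The ID above the limit becomes NA (warning suppressed for the demo).
as_int <- suppressWarnings(as.integer(ids))

# At this magnitude, doubles still distinguish neighbouring whole numbers.
exact <- (ids[2] + 1) != ids[2]
print(fits)
print(as_int)
print(exact)
```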
Will try to match on text strings.
> tail(char_lc_ids)
[1] "3503904000" "3504504064" "3504704000" "3505304064" "3505504000" "3506104064"
> char_matchtest_single <- match("800304000", char_thing)
> (char_matchtest_single)
[1] NA
> char_matchtest_single <- match("3503904000", char_thing)
> (char_matchtest_single)
[1] NA
> which(char_lc_ids %in% char_thing)
integer(0)
> char_lc_ids %in% char_thing
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[27] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[53] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
No dice!
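When matching on text, both sides need to be converted to strings the same way. A sketch of one approach, assuming the IDs start out as numeric; sprintf("%.0f", ...) is used here because as.character() can switch to scientific notation for some round values such as 1e5. The lookup strings are made up for the demo.

```r
# IDs taken from this thread, stored as doubles.
ids <- c(800304000, 3503904000, 3504504064)

# Fixed notation with no decimals, regardless of magnitude.
char_ids <- sprintf("%.0f", ids)

# Made-up character lookup table for the demo.
table_chr <- c("3503904000", "800304000")

hits <- match(char_ids, table_chr)  # NA where there is no counterpart
print(char_ids)
print(hits)
```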
Attempting to match a character representation of the thing to itself to see if it produces valid output.
> char_matchtest_self <- match(char_thing, char_thing)
> char_matchtest_self
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 3
Self match works on both files. Now considering the easiest explanation (that there aren't actually matches).
There are indeed no matches.
@aj1s you were right to be skeptical, I shouldn't have trusted my prior review on a different dataset.
At least in char_lc_ids and char_thing, there are no matches, as tested by exporting CSVs and using "find" in Excel to search char_thing for values from char_lc_ids.
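The same overlap check can be done in R without exporting to Excel. A sketch with made-up values; the variable names follow the thread, but the contents here are placeholders.

```r
# Placeholder values standing in for the real vectors.
char_lc_ids <- c("3503904000", "3504504064", "3504704000")
char_thing  <- c("808011001", "808011002", "808011003")

common         <- intersect(char_lc_ids, char_thing)  # IDs present in both
only_in_lc_ids <- setdiff(char_lc_ids, char_thing)    # IDs with no counterpart

n_common <- length(common)
print(n_common)
print(only_in_lc_ids)
```

A zero-length intersect() result confirms that no ID is shared between the two vectors.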
It appears that land ownership codes are getting messed up in the lc_ids vector. The values in that vector end in either '000' or '064'. Neither code is valid (see https://github.com/TNC-NMFO/NWLAND/issues/83#issuecomment-930469450).
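The suffix pattern can be tabulated directly. A sketch using four of the values from the tail(char_lc_ids) output above; on the full vector, the same call would list every distinct ending and how often it occurs.

```r
# Four values copied from the tail(char_lc_ids) output in this thread.
lc_ids_chr <- c("3503904000", "3504504064", "3504704000", "3505304064")

# Last three characters of each ID (the ownership-code position).
n      <- nchar(lc_ids_chr)
suffix <- substr(lc_ids_chr, n - 2, n)

counts <- table(suffix)
print(counts)
```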
I'm curious if it's a result of the 32-bit integer problem. I'll assign a two-digit code for each county/region (since there are only 98 of them), and reproduce the landcat grid.
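A rough sketch of what such a lookup could look like. The region names are hypothetical placeholders and this is not the actual NWLAND code; it only shows that zero-padded codes stay two digits wide for up to 98 regions.

```r
# Hypothetical region names; the real list has 98 counties/regions.
regions <- c("Region A", "Region B", "Region C")

# Zero-padded two-digit codes, named by region for easy lookup.
region_code <- setNames(sprintf("%02d", seq_along(regions)), regions)
print(region_code[["Region B"]])

# All 98 codes remain exactly two characters wide.
all_codes <- sprintf("%02d", 1:98)
two_wide  <- all(nchar(all_codes) == 2)
print(two_wide)
```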
merge() error resolved with two-digit county/region codes. New error received, will open new issue.
end raster processing year 2010 ; l 9 Forest ; v 2 Soil Tue Oct 05 00:14:30 2021
Error in `$<-.data.frame`(`*tmp*`, "Component", value = "Soil") :
replacement has 1 row, data has 0
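For the follow-up issue, a minimal sketch of how this message can be reproduced: assigning a length-one value to a column of a zero-row data frame. The data frame and the helper add_component() are hypothetical, not taken from the NWLAND scripts.

```r
# A data frame with a column but no rows (for example, an empty subset).
empty_df <- data.frame(Value = numeric(0))

err <- tryCatch({
  empty_df$Component <- "Soil"
  NA_character_
}, error = function(e) conditionMessage(e))
print(err)

# Hypothetical helper: repeat the label once per row, so zero rows
# yields a zero-length column instead of an error.
add_component <- function(df, label) {
  df$Component <- rep(label, nrow(df))
  df
}

safe_empty <- add_component(empty_df, "Soil")
safe_full  <- add_component(data.frame(Value = c(1, 2)), "Soil")
print(dim(safe_empty))
print(safe_full)
```

An empty data frame at this point usually means an upstream filter or match returned nothing, so checking nrow() before the assignment may help locate where the rows were lost.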
lc_inds comes from lc_inds = match(lc_ids, caland_df$Land_Cat_ID). Tried unname() on the 'lc_ids' but it didn't help make the matches work...