Closed by jclbrooks 5 years ago
A left join will duplicate the rows for you: every row that shares a value in the common column (the key) gets matched, so the stream-level values are repeated for each transect. I will work on this today (offline) and push to GitHub tomorrow.
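As a rough sketch of what the left join does here, assuming the stream name is available as a shared key: the tiny data frames below are hypothetical stand-ins for landscape_occ and landscape_characteristics, and the `position` and `featureid` values are made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in: one row per stream (values are illustrative only)
landscape_characteristics <- data.frame(
  transect  = c("Koch", "Other"),
  featureid = c(101, 202),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in: one row per transect, keyed by the stream name
landscape_occ <- data.frame(
  transect = c("Koch", "Koch", "Koch", "Other"),
  position = c("1U", "2U", "1D", "1U"),  # assumed column, not from the real data
  stringsAsFactors = FALSE
)

# Every row of landscape_occ is kept; the matching stream-level row is
# repeated for each transect that shares the key, with no row indexing needed.
joined <- left_join(landscape_occ, landscape_characteristics, by = "transect")
print(joined)
```

Because the match is done on the key value rather than on row position, it should keep working if the rows are reordered or new streams are added.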
Fixed by adding another column to landscape_occ called "transect#". For the region "wmaryland", the stream name and transect number were separated into the columns "transect" and "transect#", respectively. For all other regions (Canaan, Shenandoah, and NCR), the values in "transect#" are NA.
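One way the split could look, as a sketch only: the `region` column name, the "SiteA" row, and the assumption that the stream name and transect number are separated by the last underscore are all guesses for illustration, not taken from the actual script.

```r
library(dplyr)

# Hypothetical stand-in for landscape_occ before the fix
landscape_occ <- data.frame(
  region   = c("wmaryland", "wmaryland", "Shenandoah"),
  transect = c("Koch_1U", "Koch_2D", "SiteA"),
  stringsAsFactors = FALSE
)

landscape_occ <- landscape_occ %>%
  mutate(
    is_wmd = region == "wmaryland",
    # Text after the last underscore becomes the transect number (WMD rows only)
    `transect#` = if_else(is_wmd, sub("^.*_", "", transect), NA_character_),
    # Text before the last underscore becomes the stream name (WMD rows only)
    transect = if_else(is_wmd, sub("_[^_]*$", "", transect), transect)
  ) %>%
  select(-is_wmd)

print(landscape_occ)
```

With the stream name alone in "transect", that column can serve as the key when joining to landscape_characteristics.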
When I gathered the landscape_characteristics data, I matched all of the WMD sites within each stream to a single featureid in GIS because they were all so close together. This is causing problems now when I go to join landscape_characteristics (or landscape_vars) with landscape_occ, because each transect is a separate row in landscape_occ but the transects are grouped together by stream in landscape_characteristics. Is there an easy way to duplicate the rows in landscape_characteristics associated with each stream, so that there is a row for each transect? For example, converting the single row with "Koch" as the transect into 7 rows whose transects are "Koch_1U", "Koch_2U", "Koch_1D", "Koch_2D", "Koch_3D", "Koch_4D", "Koch_5D". Maybe using dplyr::slice (or indexing) to locate the rows and then rep() to repeat them? I don't usually like relying on the location of a row because it can change so easily, but it might be the best option. Is there a better way to do this?
I even thought of doing it by hand (I know... taboo...), but it would have to be done after the data was spread into landscape_vars, creating a new csv. I haven't definitively decided which variables to use yet, so it would be more of a headache to do it manually. So that's a no go.