The analysis with the new land-use dataset is very interesting. I see you present the results only for the lake locations, but I could see in the resources document that the land-use was also extracted for the river survey locations. Any reason to not aggregate rivers and lakes results, as it was done with the old land-use dataset?
Homogenization of sampling conditions: Field experience has shown a large difference in sampling conditions between lakes and rivers. The water body type (lake or river) is therefore treated as another independent variable to account for.
The recommender (machine learning model) is initially being developed for lake locations. It will be interesting to see how the output from the ML model compares to the Rho values for lake locations.
River samples are reported separately in the federal report. As the report evolves, it makes sense to do the same with any analysis.
That being said, it really depends on how you want to express the effects of river sampling on survey results:
As a coefficient in an equation
As an independent model
For now it is simpler to make the models separately.
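To make the two options concrete, here is a minimal sketch on synthetic data. It is not the analysis used in the report: the linear form, the variable names (`land_use`, `is_river`, `survey_result`), and the simulated values are all assumptions chosen for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least-squares fit; returns coefficients, intercept first."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Synthetic data for illustration only (hypothetical names and values).
rng = np.random.default_rng(0)
n = 200
land_use = rng.uniform(0, 1, n)      # assumed land-use fraction per location
is_river = rng.integers(0, 2, n)     # 1 = river location, 0 = lake location
survey_result = 1.0 + 2.0 * land_use + 0.5 * is_river + rng.normal(0, 0.1, n)

# Option 1: one pooled model, with river/lake as a coefficient in the equation.
pooled = fit_ols(np.column_stack([land_use, is_river]), survey_result)

# Option 2: independent models, one fitted per water body type.
lake = is_river == 0
lake_model = fit_ols(land_use[lake], survey_result[lake])
river_model = fit_ols(land_use[~lake], survey_result[~lake])

print("pooled [intercept, land_use, is_river]:", pooled)
print("lake   [intercept, land_use]:", lake_model)
print("river  [intercept, land_use]:", river_model)
```

In the pooled model the river effect is a single shift applied to every location, while the separate models let the land-use slope differ between lakes and rivers. That extra flexibility costs nothing in setup, which is one way to read "simpler to make the models separately".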