I have a rather unusual dataset: several million GPS locations in a constrained environment, which makes determining categorical habitat availability straightforward; the availabilities are summarized in a frequency table. Given the movement capabilities of the study species, I can reasonably assume all habitats are available, but generating a 10-fold random sample of available points to contrast with the used data would produce over a hundred million records, which is beyond my compute capacity.

Is it appropriate to run the `rsf` or `rspf` functions with a weights argument that is 1 for used cases and equal to the frequency of occurrence for available cases? This would essentially reduce the available data to one row per habitat class.

The simulation I just ran produces identical point estimates whether or not I use `weights = Freq`. The standard errors differ, but I expect that is because I only used a few bootstrap replicates as a trial run.