DS4B-ANU closed 5 months ago
To clean the data, the 'automated' step I've done so far is to use a raster (which essentially gives an outline of Australia) to remove any rows whose latitude and longitude don't fall within Australia. This type of thing is probably meant to be done with shapefiles, I wouldn't know, but this worked.
I then manually came up with criteria to remove other rows (e.g. if a record falls below a particular line, or is in Alice Springs).
https://github.com/nickboffa/DS4B-final-report/blob/c04be341048d1b54ee3993835867b56a56b3c4cb/final_project.Rmd#L64
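For readers who want to see the idea behind the raster filter, here is a minimal sketch. It is not the code in the linked Rmd, and it is written in Python only for illustration; the same lookup carries over to R. The `LandMask` class, the `keep_on_land` helper, the record fields `id`, `lat`, `lon`, and the tiny 3 x 4 grid are all made up for the example (the grid is not a real outline of Australia).

```python
import math
from dataclasses import dataclass


@dataclass
class LandMask:
    """Toy north-up raster: rows run north to south, columns west to east."""
    cells: list        # list of rows; each row is a list of booleans (True = land)
    west: float        # longitude of the raster's left edge
    north: float       # latitude of the raster's top edge
    cell_size: float   # cell width and height, in degrees

    def is_land(self, lat, lon):
        # Convert the coordinate to a row/column index in the grid.
        col = math.floor((lon - self.west) / self.cell_size)
        row = math.floor((self.north - lat) / self.cell_size)
        # Anything outside the raster's extent counts as "not in Australia".
        if row < 0 or col < 0 or row >= len(self.cells) or col >= len(self.cells[0]):
            return False
        return self.cells[row][col]


def keep_on_land(records, mask):
    """Keep only records that have coordinates and fall on a land cell."""
    return [
        r for r in records
        if r["lat"] is not None
        and r["lon"] is not None
        and mask.is_land(r["lat"], r["lon"])
    ]


# Hypothetical 10-degree grid covering lon 110..150 and lat -10..-40.
mask = LandMask(
    cells=[
        [False, True, True, True],
        [True, True, True, True],
        [True, False, True, True],
    ],
    west=110.0,
    north=-10.0,
    cell_size=10.0,
)

records = [
    {"id": 1, "lat": -25.0, "lon": 135.0},  # inside the grid, on a land cell
    {"id": 2, "lat": -15.0, "lon": 112.0},  # inside the grid, on a non-land cell
    {"id": 3, "lat": 51.5, "lon": -0.1},    # outside the grid entirely
    {"id": 4, "lat": None, "lon": 140.0},   # missing latitude
]

kept = keep_on_land(records, mask)
print([r["id"] for r in kept])
```

A real raster is far finer than this, but the principle is the same: each record is either on a land cell or it is dropped. Records just off the coast can be lost if the cells are coarse.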
One issue with SDMs is that false positives (i.e. records of a species where it doesn't actually occur) can really mislead the models.
Have a think about ways to automate data cleaning. You can't do it by hand if you plan to make this work for every species in the ALA. But you can have a look at the data and think about what does and doesn't make sense.
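As one possible way to automate a rule like "falls below a particular line", here is a rough sketch that flags records sitting far from the rest of their own species. It is an assumption about what might work, not a recommended method: it uses the median and the median absolute deviation (MAD) of latitude and longitude, the cutoff of 5 is arbitrary, and the function name `flag_outliers` and the fields `id`, `species`, `lat`, `lon` are made up for the example. It is written in Python for illustration, but the logic is simple enough to port to R.

```python
from collections import defaultdict
from statistics import median


def flag_outliers(records, threshold=5.0):
    """Return the ids of records far from their species' median location.

    A record is flagged if its latitude or longitude is more than
    `threshold` MADs away from that species' median on the same axis.
    """
    by_species = defaultdict(list)
    for r in records:
        by_species[r["species"]].append(r)

    flagged = set()
    for rows in by_species.values():
        for axis in ("lat", "lon"):
            values = [r[axis] for r in rows]
            centre = median(values)
            mad = median([abs(v - centre) for v in values])
            if mad == 0:
                # No spread to compare against, so skip this axis.
                continue
            for r in rows:
                if abs(r[axis] - centre) / mad > threshold:
                    flagged.add(r["id"])
    return flagged


# Made-up sightings: four clustered records and one far to the north.
sightings = [
    {"id": 1, "species": "A", "lat": -35.1, "lon": 149.0},
    {"id": 2, "species": "A", "lat": -35.3, "lon": 149.2},
    {"id": 3, "species": "A", "lat": -34.9, "lon": 148.9},
    {"id": 4, "species": "A", "lat": -35.0, "lon": 149.1},
    {"id": 5, "species": "A", "lat": -12.0, "lon": 149.0},
]

print(flag_outliers(sightings))
```

Because the rule is computed per species, it needs no hand-tuned lines. It will, however, wrongly flag genuine disjunct populations, so the flagged records are worth looking at before they are dropped.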
Typical issues are things like: