kaitlyngaynor / gorongosa-mesocarnivores


WHY IS SCALE() CREATING A TON OF NaNs? #106

Closed klg-2016 closed 3 years ago

klg-2016 commented 3 years ago

https://github.com/kaitlyngaynor/gorongosa-mesocarnivores/blob/5897986e7b1bfd281c208a80af8ce623051a5930/scripts/multi-season%20model/01-multi-season-data-prep.R#L136

The sample code from the colext() pdf suggests scaling the date matrix before inputting it into the model. They simply use the scale() function, which I've also used before. However, when I try it here, it converts all values to NaNs. I was worried that it was doing so because the numbers were all so high (over 1700), so I tried changing the origin of the Julian dates, but that didn't fix it. Any thoughts on what's going on?

klg-2016 commented 3 years ago

okay it's because there's no variance in the columns. Am I doing that right? all the sites were "checked" on the same days, because we're counting each day as an individual survey, right? Should I just leave the dates as Julian dates then?
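A minimal sketch of what seems to be happening, using a made-up 3-site by 4-day date matrix (the values and object names are hypothetical, not the project data). By default scale() centers and scales each column separately, so a column where every site has the same date has a standard deviation of 0, and the division becomes 0/0:

```r
# Hypothetical date matrix: 3 sites (rows) x 4 daily surveys (columns).
# Every site shares the same survey date, so each column is constant.
dates <- matrix(rep(c(1701, 1702, 1703, 1704), each = 3), nrow = 3)

# scale() works column by column: (x - column mean) / column sd.
# Each column sd is 0 here, so every cell becomes 0 / 0.
scaled_by_column <- scale(dates)
all(is.nan(scaled_by_column))

# One possible alternative (an assumption, not necessarily what the
# colext() example does): standardize with the overall mean and sd
# of the whole matrix instead of per column.
scaled_overall <- (dates - mean(dates)) / sd(c(dates))
scaled_overall
```

Whether overall standardization is appropriate for the model is a separate question; the sketch only shows where the NaNs come from.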

kaitlyngaynor commented 3 years ago

I think this issue has become moot since we don't need to use Julian date as a covariate. Closing for now

klg-2016 commented 3 years ago

Yes I think that's a good idea--you can tell how frustrated this issue made me by the excessive use of capitalization

kaitlyngaynor commented 3 years ago

I never looked into this, but my guess is that the cells were formatted as some kind of date format rather than numeric? You may have had to do as.numeric() first. But maybe not worth the headache of figuring it out if you don't need it. Just a tip for the future: you should check the format of the data before applying a function to it, since some functions don't like all forms of data. Well-written functions will give you an error that says something like "Error: function cannot be applied to date format, must be numeric", but others will just silently propagate NAs!
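A quick sketch of that kind of check, with made-up dates (the object names are hypothetical, and this is only an illustration of the general tip, not a diagnosis of the script linked above):

```r
# Hypothetical survey dates stored as Date objects rather than numbers
survey_dates <- as.Date(c("2016-08-01", "2016-08-02", "2016-08-03"))

# Check what kind of object you have before passing it to a function
class(survey_dates)
is.numeric(survey_dates)

# as.numeric() on a Date gives the number of days since 1970-01-01
date_numbers <- as.numeric(survey_dates)
is.numeric(date_numbers)

# Now the input is plain numeric, and these dates do vary,
# so scale() returns finite values
scaled <- scale(date_numbers)
scaled
```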

klg-2016 commented 3 years ago

I think the issue here was the lack of variance in columns, because all "surveys" were conducted on the same dates. I got that far in figuring stuff out and then didn't know where to go from there.

My less generalized way of addressing that point was to check what format the data going into the sample code was in and figure out how to convert my data to the same format, only to see the same problem again. But the more generalized form of that advice is good to keep in mind. Would the function description (when you type ? + the function name) be the place to look for what format the input should be?

kaitlyngaynor commented 3 years ago

Hmm, well, I guess we can just leave this problem unsolved for now!

> Would the function description (when you type ? + the function name) be the place to look for what format the input should be?

Yep. When you go to the help pane for scale(), for example, you see that the first argument is x and they specify that x is a numeric matrix(like object)

Dunno why it's written like that but I think it means a numeric matrix-like object, which means it is an object with rows & columns with numerical values in it.
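For what it's worth, here is a small sketch of checking the arguments and trying a "matrix-like" input. The data frame is made up for illustration:

```r
# In an interactive session, ?scale opens the help page.
# The argument names can also be listed directly:
names(formals(scale))

# A hypothetical data frame with only numeric columns is "matrix-like":
# scale() converts it to a numeric matrix internally.
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
sapply(df, is.numeric)

scaled_df <- scale(df)
is.matrix(scaled_df)
scaled_df
```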

klg-2016 commented 3 years ago

help pane, that's what it's called. Got it, thank you!