baktoft / yaps

YAPS - Yet Another Positioning Solver
GNU General Public License v3.0
17 stars 4 forks source link

Fail to find minimum #2

Closed JennaVergeynst closed 5 years ago

JennaVergeynst commented 5 years ago

Hi @baktoft,

I'm trying out your YAPS R-package. I have new data now with a smaller and more regular burst interval (theoretically about 1 second), so this should be great to feed to YAPS :-) However, I'm getting: Newton failed to find minimum. In addition: Warning message: In stats::nlminb(inp$inits, obj$fn, obj$gr) : NA/NaN function evaluation outer mgc: NaN Error in[.default(summary(jointrep), , 2) : incorrect number of dimensions

Does the warning message give you a clue about what's going wrong?

Below the code I'm using and attached a TOA file.

library(yaps)
set.seed(42)

pingType <- 'sbi'

toa_data <- read.csv(paste0(PATH, 'YAPS/TOA_data/TOA_46902.csv'), check.names = FALSE) # check.names False avoids that R puts X before receiver name
receivers <- colnames(toa_data)[3:(ncol(toa_data)-2)]
passings <- unique(toa_data$groups_pas)

hydro_data <- read.csv(paste0(PATH,'data/stations.csv'), row.names = 'Serial')
hydros <- hydro_data[rownames(hydro_data) %in% receivers,colnames(hydro_data) %in% c('X','Y')]
names(hydros) <- c("hx", "hy")

i=1
toa <- toa_data[toa_data$groups_pas ==passings[i],colnames(toa_data)%in% receivers]
toa <- t(as.matrix(toa))
T0 <- min(toa, na.rm=TRUE) #reshape so first TOA is 0
toa <- toa - T0

if(pingType == 'sbi'){
    inp <- getInp(hydros, toa, E_dist="Mixture", n_ss=10, pingType=pingType, sdInits=0)
} else if(pingType == 'rbi'){
    inp <- getInp(hydros, toa, E_dist="Mixture", n_ss=10, pingType=pingType, sdInits=0, rbi_min=rbi_min, rbi_max=rbi_max)
}
str(inp)

pl <- c()
maxIter <- ifelse(pingType=="sbi", 500, 5000)
outTmb <- runTmb(inp, maxIter=maxIter, getPlsd=TRUE, getRep=TRUE)

TOA_46902.txt

baktoft commented 5 years ago

Hi @JennaVergeynst, Thank you for your continued interest in YAPS :-). Could you send me the file with station positions as well? Then, I will try to have a look and get it running. Cheers, \hb

JennaVergeynst commented 5 years ago

Attached! Thank you! Yes, YAPS is our only hope on nice tracks ;-)

baktoft commented 5 years ago

Hi Jenna, I don't see the station position file? Just send me a PM with the file

JennaVergeynst commented 5 years ago

Try again...

baktoft commented 5 years ago

Hi again, I had a look at your data and found two major problems.

  1. order of hydros different between hydros and toa - they should be in the same order
  2. the TOA matrix had some issues regarding structure - see comments in script below... I attempted a quick and dirty fix for this which made the model run, but the result seems a bit off - probably due to the fix being to quick and dirty. Try to make a better TOA-matrix and see if that helps...

This script should run and produce a track of sorts...

library(yaps)
set.seed(42)

pingType <- 'sbi'

toa_data <- read.csv('TOA_46902.txt', check.names = FALSE) # check.names False avoids that R puts X before receiver name
receivers <- colnames(toa_data)[3:(ncol(toa_data)-2)]
passings <- unique(toa_data$groups_pas)

hydro_data <- read.csv('stations.csv', row.names = 'Serial')
hydros <- hydro_data[rownames(hydro_data) %in% receivers,colnames(hydro_data) %in% c('X','Y')]
names(hydros) <- c("hx", "hy")
plot(hydros)

# Check that order of hydros are the same as order in toa
rownames(hydros) == rownames(toa)
# Fix this...
hydros <- hydros[order(rownames(hydros)),]
# Better...
rownames(hydros) == rownames(toa)

# Make the grid centred close to 0;0
hydros[,'hx'] <- hydros[,'hx'] - 900
hydros[,'hy'] <- hydros[,'hy'] - 900

i=1
toa <- toa_data[toa_data$groups_pas ==passings[i],colnames(toa_data)%in% receivers]
toa <- t(as.matrix(toa))
T0 <- min(toa, na.rm=TRUE) #reshape so first TOA is 0
toa <- toa - T0

# TOA matrix needs to be very regular - i.e. one ping per column - this means include empty columns if a ping is not detected at all
# Check that TOA is well formatted
plot(diff(colMeans(toa, na.rm=TRUE))) # Doesn't look good - all points should be centred at the burst interval 1.25

# E.g. this section is not good - the ping at ~114 sec is found in three different columns
which(diff(colMeans(toa, na.rm=TRUE)) < 0.1)
toa[,90:95] 

# This section have a gap of ~5 seconds between columns 653 and 654
which(diff(colMeans(toa, na.rm=TRUE)) > 4)
toa[,650:655] 

# Below is a quick and dirty attempt to correct this TOA based on the data given...
# This should of course be done in a better way and sooner in the process, but just to try it out...
library(data.table)
rownames(toa) <- c()
long <- data.table(reshape2::melt(toa)[,c(1,3)])
colnames(long) <- c('h','toa')
long <- long[!is.na(toa), ]
head(long)

nping <- floor(max(long[,2]) / 1.25)
approx_toa <- 1:nping * 1.25

long_new <- data.table(expand.grid(1:16, approx_toa))
colnames(long_new) <- c('h','toa')

long[,roll:=toa]
long_new[,roll:=toa]

setkey(long, h, roll)
setkey(long_new, h, roll)

long_joined <- long[long_new, roll='nearest']
long_joined[which(abs(long_joined[,toa] - long_joined[,roll]) > 0.5)]$toa <- NA
long_joined <- long_joined[,c('h','toa','roll')]

new_toa <- reshape2::acast(long_joined, roll~h, value.var="toa")
rownames(new_toa) <- c()
new_toa <- t(new_toa)

# This is looking a bit better now, but not perfect...
plot(diff(colMeans(new_toa, na.rm=TRUE)))

inp <- getInp(hydros, new_toa, E_dist="Mixture", n_ss=10, pingType=pingType, sdInits=0)
str(inp)

pl <- c()
maxIter <- ifelse(pingType=="sbi", 500, 5000)
outTmb <- runTmb(inp, maxIter=maxIter, getPlsd=TRUE, getRep=TRUE)

str(outTmb)

# Estimates in pl
pl <- outTmb$pl

# Error estimates in plsd
plsd <- outTmb$plsd

# plot the resulting estimated track
# It doesn't look quite right, but the way to correct TOA above is really not the right way to do it ;-)
plot(hy~hx, data=hydros, type="p", asp=1, xlim=c(-100,300))
lines(pl$X~pl$Y, col="red")

Let me know how it turns out :-) \hb

JennaVergeynst commented 5 years ago

Hi Henrik,

Thanks for these ideas! I realised that the ping interval is actually not fixed, but random between 1.1 and 1.3 seconds, so I should change the pingtype to 'rbi' I suppose. Also there was another ping interval of 50 to 70 seconds present, of another kind of pings, which I could throw out, but probably this also altered the quality of the track...

So for the random ping interval, is it also needed to have the TOA matrix filled in where there are gaps? So for instance fill it with NaN rows on every 1.2 second, as this is the average burst interval? Or would it be better to split the matrix when there is a too large gap (say 5 seconds), and let YAPS run over the different matrices seperately (and still for this matrices fill the smaller gaps)...?

Then another question: why is it that you need the grid centred close to (0,0)?

Thanks a lot!

JennaVergeynst commented 5 years ago

Btw, with pingtype on random burst interval and cleaned TOA matrix, I'm getting a nice track! :-)

baktoft commented 5 years ago

Hi Jenna, Is the burst interval uniform distributed between 1.1 and 1.3 or normal distributed around 1.2? In either case, you should be fine specifying it as a stable burst interval - you will just not get the benefits of a really stable burst interval ;-). What are the other pings at 50 to 70 sec? Where do they come from - the same tag? Sounds strange...

Whether you use stable or random BI, your TOA matrix should have one column per ping - period. No more - no less. It would be better to split the process if you have large gaps, but the threshold is much higher than 5 sec. I typically use something like 30 or 60 minutes. In cases where you have such large gaps, the requirement for one ping per column is relaxed in the gaps, as long as there is an approximately correct number of columns filling the gap.

The 0,0 is not really mandatory, but good to use when debugging. Just don't use raw UTM coordinates ;-)

Glad to hear that you got a better track with the cleaned TOA :-)

Cheers, \hb

JennaVergeynst commented 5 years ago

Hi Henrik,

It's a uniform distribution I guess. Actually Vemco configures the tag so that it emits randomly between 1.1 and 1.3. They do so to avoid that to fish would ping constantly at the same moment and would cause constant collisions...

The other pings come from the same tag indeed. Well, the 1.1-1.3 are HR-transmissions (High Residence, their new technique), those have a short pulse of about 6 ms. The 50-70s are PPM tranmissions (Pulse Position Modulation, their old technique which takes some seconds to transmit). I think they use both because older receivers only detect PPM, for presence/absence detection further in the network.

So in which case would you use rbi than, only if the interval is of a certain size?

Really 30 to 60 minutes? So the algorithm should be able to cope with 30 minutes silence and than create a new track? Then the output would consist of different seperated tracks?

baktoft commented 5 years ago

Well, the output for silenced period will just be an interpolation between the bordering known positions....

I would use rbi on data from the PPM transmissions. This is kind of a worst case scenario in which there is absolutely no information to gain from the burst intervals. Another vendor use pseudo-random burst intervals which do contain info that can be used in the model. An implementation for pseudo-random BI is coming up eventually ;-)

JennaVergeynst commented 5 years ago

Hi,

Just to mention that I still need to use the rbi version for these data: when I use sbi I get a track that is hugely different, and with mean error of 21 m (versus 0.7 m for rbi)...

46902_rbi_vs_sbi

Btw, I have some code to prepare this cleaned TOA-matrix where the gaps are filled and double rows for one ping omitted (as good as possible), departing from synchronised detection data. It is based on the data format as delivered by Vemco, and it is also in python... But it might be a help to start from for other people who want to use yaps, so maybe it's interesting to share it on your github page? Let me know if you're interested, then I'll make a branch.

baktoft commented 5 years ago

Thanks for the feed back and info re sbi vs rbi for these data. I am a bit surprised by the crappy track from sbi, but so be it. Maybe changing the inits would help... Anyways, guess the rbi solution looks ok. Do you have any gps'ed test tracks to test it against?

If your code to prepare the TOA-matrix is documented etc, you are very welcome to put it in a branch. I will link to it in the readme, so others can find it :-). Doing the hydro synchronization seems to be a huge obstacle for many, so any hints towards solutions are nice. Unfortunately, my own scripts for this is at the moment not ...ahm... documented and ready for sharing... ;-)

JennaVergeynst commented 5 years ago

No gps tracks unfortunately, but we do have vps tracks (so the tracks processed by the company) to compare it with. And the rbi one is close the vps track (after filtering the latter).

I do have code for synchronisation as well, that I have tried to make understandable. I'll make a branch some day for both! :-)