Closed by ethanwhite 9 years ago
After a little looking around I think the problem may be related to the IDs. The files that are initially being downloaded have names of the form:
Lat33.00000Lon-90.00000Start2014-01-01End2014-12-31___MOD13Q1.asc
It looks like all that's being done to generate the IDs is to strip off ___MOD13Q1.asc, in which case these don't match the IDs that are being generated by UpdateSubsets, which are of the form:
Lat33.00000Lon-90.00000Start2000End2014
I think this means that when https://github.com/seantuck12/MODISTools/blob/master/R/UpdateSubsets.R#L57 is executed, the desired subsetting isn't happening.
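To make the mismatch concrete, here is a minimal base R sketch. It only assumes what is described above (that the ID is obtained by stripping the ___MOD13Q1.asc suffix from the file name); it is not the code in UpdateSubsets.R, and the variable names are made up for illustration.

```r
# File name and ID format exactly as quoted in this thread.
downloaded_file <- "Lat33.00000Lon-90.00000Start2014-01-01End2014-12-31___MOD13Q1.asc"
generated_id <- "Lat33.00000Lon-90.00000Start2000End2014"

# Assumption: the ID for a downloaded file is just the file name
# with the product suffix removed.
file_id <- sub("___MOD13Q1.asc", "", downloaded_file, fixed = TRUE)

# The two strings differ in their Start/End parts, so an exact
# membership test never finds the downloaded subset.
already_downloaded <- generated_id %in% file_id

print(file_id)
print(already_downloaded)
```

Because the date portions are formatted differently, the lookup comes back FALSE and nothing would be dropped from the input.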
Hi Ethan, yes as you've noticed UpdateSubsets is not dealing with the presence or absence of subset IDs in a clever way, which is causing this bug. I'm overhauling the code in a big way at the moment and this will be one of the functions getting a makeover. In the meantime, I'll push a quick fix as soon as possible.
I've pushed a fix to the master repository. It should now return a trimmed version of the input data.frame – where all subsets that have already been downloaded and saved in the specified directory are removed – whether or not subsets have ID names. Thanks for pointing out the bug, do let us know if you have any further problems.
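For readers who want a picture of what "trimmed whether or not subsets have ID names" could look like, here is a rough base R sketch. It is not the fix that was pushed: trim_downloaded is a hypothetical helper, the lat and long column names are assumptions, and it matches on location only, ignoring dates, which is a deliberate simplification.

```r
# Hypothetical helper, NOT the MODISTools implementation.
# coords: data.frame with (assumed) numeric columns lat and long.
# files:  character vector of file names already in the save directory.
trim_downloaded <- function(coords, files) {
  # Build a location key in the same style as the file names quoted above.
  key <- paste0("Lat", sprintf("%.5f", coords$lat),
                "Lon", sprintf("%.5f", coords$long))
  # A row counts as done if any existing file name starts with its key.
  done <- vapply(key, function(k) any(startsWith(files, k)), logical(1))
  # Return only the rows that still need downloading.
  coords[!done, , drop = FALSE]
}

coords <- data.frame(lat = c(33, 34), long = c(-90, -91))
files <- "Lat33.00000Lon-90.00000Start2014-01-01End2014-12-31___MOD13Q1.asc"
remaining <- trim_downloaded(coords, files)
print(remaining)
```

Since the key is rebuilt from the coordinates, the result does not depend on whether the input has an ID column.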
A second reason why your example would not work would be that MODISSubsets downloaded data with StartDate = FALSE (default), whereas UpdateSubsets has StartDate = TRUE. We've since decided that having start dates as optional is a bad idea, as it is confusing and introduces the opportunity for users to unknowingly download the wrong time series. I'm in the process of deprecating this option from all functions so in future versions start dates will be compulsory.
I've pushed a fix to the master repository.
Thanks for the quick fix! It looks like it's working great.
A second reason why your example would not work would be that MODISSubsets downloaded data with StartDate = FALSE (default), whereas UpdateSubsets has StartDate = TRUE.
Yeah, I figured that one out early this afternoon after hacking around the other issue. I agree that it's confusing so I think it's a good call on the update. Thanks for coming back to point it out to me.
I've been trying to use UpdateSubsets to help recover when DAAC hangs in the middle of a large download. Based on this part of the description:
I expected that when some data had already been downloaded UpdateSubsets would return a shorter dataframe than that of the original data, which could then be passed to MODISSubsets to resume downloading the data. That doesn't seem to be the case for me. Here's a simple example:
It appears that even though UpdateSubsets identified the 3 subsets that had been previously downloaded, running MODISSubsets on the output of UpdateSubsets results in downloading all 6 files. coord_data and unacquired_coord_data contain the same sites, the only difference is presence of the ID column.
I'm probably just misunderstanding something about how UpdateSubsets is supposed to work. Any help you can provide in pointing me in the right direction would be appreciated.