Robert-W / grunt-ftp-push

Deploy files to an FTP server as part of your Grunt workflow.
MIT License
39 stars 14 forks

FR: only push modified files since last push #45

noerw closed 8 years ago

noerw commented 8 years ago

The mtime (and size?) of both the local file and the remote file could be compared to decide whether to push a file or not. This might create some networking overhead, but it would quickly pay off with large files. This could be configured for a target via an option:

patch: true

or

patch: {
  serverTimeOffset: <offsetInSeconds>
}

where serverTimeOffset defines the difference, in seconds, between the local and remote clocks.
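
The comparison proposed above could look roughly like the sketch below. It is only an illustration, not the plugin's code: `shouldPush` and the `{ mtime, size }` objects are hypothetical, and the sign convention (serverTimeOffset = remote clock minus local clock) is an assumption, since the proposal does not fix it.

```javascript
// Hypothetical helper: decide whether a local file should be pushed.
// `local` and `remote` are plain objects { mtime: <seconds since epoch>, size: <bytes> }.
// `remote` is null when the file does not exist on the server yet.
// ASSUMPTION: serverTimeOffset = remote clock minus local clock, in seconds.
function shouldPush(local, remote, serverTimeOffset) {
  if (remote === null) {
    return true; // nothing on the server yet
  }
  if (local.size !== remote.size) {
    return true; // different sizes mean different contents
  }
  // Translate the remote timestamp into local clock time before comparing.
  const remoteMtimeInLocalTime = remote.mtime - (serverTimeOffset || 0);
  return local.mtime > remoteMtimeInLocalTime;
}
```

With a server clock running an hour ahead, leaving the offset at 0 would make every remote copy look newer than it is, so a same-size local edit would be skipped.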

evanplaice commented 8 years ago

@noerw Comparing files based on mod time is the issue because the FTP protocol wasn't designed with incremental sync in mind.

The FTP protocol doesn't include a command to fetch the server time, let alone an easy/standard way to extract the timezone offset.

The only hacky approach I've found to work around this limitation is to create a temporary file on both the local and remote so the timestamps can be compared.

The comparator used to measure the time difference also has to have a sufficient 'fudge factor' to account for the network latency involved in uploading the temp file and to ensure a 'close enough' match while comparing existing files.
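
A rough sketch of that idea, under stated assumptions: the function names are hypothetical, all times are in milliseconds since the epoch, and the two temp-file timestamps are assumed to have been read already (how the remote one is fetched is left out). This is not how node-ftpsync does it.

```javascript
// Hypothetical helpers; all times are milliseconds since the epoch.

// Estimate the clock offset from a temp file written locally and then uploaded.
// The result also contains the upload latency, hence the fudge factor below.
function estimateOffset(localTempMtime, remoteTempMtime) {
  return remoteTempMtime - localTempMtime;
}

// The local file counts as newer only if it leads the remote copy by more
// than `fudgeMs` after the remote timestamp is shifted into local time.
function localIsNewer(localMtime, remoteMtime, offsetMs, fudgeMs) {
  const remoteInLocalTime = remoteMtime - offsetMs;
  return localMtime - remoteInLocalTime > fudgeMs;
}
```

A larger `fudgeMs` reduces false positives from latency but can miss edits made shortly after an upload, so the value is a trade-off rather than a constant.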

This feature has been discussed [here](https://github.com/evanplaice/node-ftpsync/issues/21) because node-ftpsync already does incremental updates based on file size; mod time comparison may be included in a future update.

Robert-W commented 8 years ago

Hey Guys,

This is something I have thought about adding but have not had a chance to work on, as I am a little preoccupied with work and a couple of other projects at the moment. Making extra trips to the server to pull down files and look at last-modified times, or creating temp files in both locations, is not ideal in my opinion, and as far as I am aware there are no simple commands to get all last-modified times that won't be costly. However, there may be some other approaches that could accomplish this much more efficiently.

One that I may try to explore when I have some free time in the near future is to create a local temp file in the plugin (not something that would ever show up in your src code, just in this plugin's folder) that would be a key-value listing of files and hashes. Essentially, when any content is uploaded, create a hash for the file and store it; then, when it goes to upload the files again, hash each file and compare the hashes. If they are different, then something changed. This should work with almost no overhead, depending on the complexity of the hashing function.
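
A minimal sketch of that hash listing, assuming SHA-1 via Node's built-in `crypto` module. The helper names and the in-memory `cache` object are hypothetical; a real plugin would load and save the listing as a file and would read the contents from disk.

```javascript
const crypto = require('crypto');

// Hypothetical helper: hex SHA-1 digest of a file's contents (Buffer or string).
function hashContent(content) {
  return crypto.createHash('sha1').update(content).digest('hex');
}

// `cache` maps file path -> hash recorded at the last upload.
// `files` maps file path -> current contents.
// Returns the paths whose hash differs from the cache and updates `cache` in place.
function filesToPush(cache, files) {
  const changed = [];
  for (const path of Object.keys(files)) {
    const hash = hashContent(files[path]);
    if (cache[path] !== hash) {
      changed.push(path);
      cache[path] = hash;
    }
  }
  return changed;
}
```

For simplicity the cache is updated as soon as a change is detected; recording the hash only after the upload succeeds would be safer, so that a failed transfer is retried on the next run.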

It may be at least a week or two before I have a chance to look into this, so if either of you would like to contribute that would be awesome. If not, I have to catch up on some other projects first before I can attend to this. (Just finished school so I am a little behind on things haha)

evanplaice commented 8 years ago

I already maintain node-ftpsync and grunt-ftpsync as well as many other projects, so I'm not looking to pick up any more.

I'm game to discuss the specifics.

Robert-W commented 8 years ago

I hear ya, the concept is really simple and there are a couple of ways of doing it. Depending on how you architected your modules, the implementation should be simple as well. It would just require storing a key-value list in a file somewhere locally, where the key is the path to the file and the value is a content hash, file stats, or some other method for uniquely identifying the state of a file. There is an interesting article about the way git does similar things here: http://www-cs-students.stanford.edu/~blynn/gg/race.html. Those methods, and what git does, are a little much for simple plugins like the ones you and I have written, but a lighter version could probably be achieved relatively easily, and some of the problems mentioned we won't encounter because we're not going to need to catch sub-second changes. Also, see http://www-cs-students.stanford.edu/~blynn/gitmagic/ch08.html under the Indexing section, where it mentions using stat calls on files and indexing those instead of reading the files, which should be much faster than content hashing.
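
A lighter, stat-based version of such an index might be sketched as follows. The helper names and the `size:mtime` signature format are assumptions made for illustration, not what git or this plugin actually stores.

```javascript
const fs = require('fs');

// Hypothetical helper: a cheap fingerprint built from stat data only,
// so the file contents never have to be read.
function statSignature(filePath) {
  const stats = fs.statSync(filePath);
  return stats.size + ':' + stats.mtimeMs;
}

// `index` maps file path -> signature seen on the previous run.
// Returns true when the file is new or its signature differs, and records
// the current signature in `index`.
function hasChanged(index, filePath) {
  const signature = statSignature(filePath);
  const changed = index[filePath] !== signature;
  index[filePath] = signature;
  return changed;
}
```

An edit that keeps the size identical and lands within the file system's timestamp granularity would go unnoticed here, which is the kind of race the linked article describes.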

Robert-W commented 8 years ago

@noerw I will be taking a look at this over this weekend and next weekend and will be posting updates here; I am hoping to have this implemented in the next week or two.

Robert-W commented 8 years ago

UPDATE: Sorry for the delays. I have a branch with some passing test cases; I just need to integrate it across the project and document it, and it will be deployed soon.

Robert-W commented 8 years ago

This is finally out; you can test it on version 1.1.0.