The wiki extraction scripts work well, but the process could be streamlined and the code base improved. The following should be implemented:
- Use a session for GET requests to reduce connection overhead on both client and server
- Automatically determine the date of the last extraction so it does not need to be entered manually
  - Could use a file timestamp (simple)
  - Or add each page's last-update time to the extracted data (more complex)
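
A rough sketch of the first item and the simple timestamp option, assuming the scripts are written in Python and use the `requests` library (neither is confirmed here). The function names `fetch_all` and `last_extraction_time` are hypothetical and not taken from the existing scripts.

```python
import os
from datetime import datetime, timezone
from typing import Dict, Iterable, Optional

import requests  # assumption: the scripts use (or can use) the requests library


def last_extraction_time(output_path: str) -> Optional[datetime]:
    """Return the output file's modification time in UTC, or None if it does not exist yet.

    Hypothetical helper: treats the mtime of the previous extraction's
    output file as the date of the last extraction.
    """
    try:
        mtime = os.path.getmtime(output_path)
    except FileNotFoundError:
        return None
    return datetime.fromtimestamp(mtime, tz=timezone.utc)


def fetch_all(session: requests.Session, urls: Iterable[str]) -> Dict[str, str]:
    """GET every URL through one shared session and return {url: response body}.

    A session keeps the underlying connection open between requests to the
    same host, so each GET avoids a fresh TCP/TLS handshake.
    """
    pages = {}
    for url in urls:
        response = session.get(url, timeout=30)
        response.raise_for_status()  # fail loudly on 4xx/5xx responses
        pages[url] = response.text
    return pages
```

A caller would open one session for the whole run, for example `with requests.Session() as session: pages = fetch_all(session, urls)`, and treat a `None` result from `last_extraction_time` as "no previous extraction, fetch everything". The file-timestamp approach is only reliable if nothing else touches the output file between runs; storing each page's last-update time in the extracted data avoids that weakness at the cost of more code.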