gwu-libraries / TweetSets

Service for creating Twitter datasets for research and archiving.
MIT License

spark-reload should remove files from previous loads. #160

Open lwrubel opened 2 years ago

lwrubel commented 2 years ago

If the full-dataset/ directory contains files from a previous load made with a pre-2.2.0 version, the spark-reload command does not remove them. This leads to the following display in the UI:

[Screenshot: Screen Shot 2021-11-15 at 7 56 11 AM]

Also confirm that files get removed with spark-reload >= 2.2.0.
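
For illustration, here is a rough sketch of the cleanup behavior requested above, assuming the reload should empty the full-dataset/ directory before writing new files. The function name `clear_previous_load` and the file names in the usage note are hypothetical and are not taken from the TweetSets code; the actual spark-reload implementation may organize this differently.

```python
import shutil
from pathlib import Path
from typing import List


def clear_previous_load(full_dataset_dir: Path) -> List[str]:
    """Delete everything inside full_dataset_dir, keeping the directory itself.

    Hypothetical helper: returns the names of the removed entries so the
    caller can log what was cleaned up.
    """
    removed: List[str] = []
    # Nothing to clean if the directory has never been created.
    if not full_dataset_dir.is_dir():
        return removed
    for entry in sorted(full_dataset_dir.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            # Real subdirectories (e.g. output folders) are removed recursively.
            shutil.rmtree(entry)
        else:
            # Regular files and symlinks are unlinked individually.
            entry.unlink()
        removed.append(entry.name)
    return removed
```

Calling `clear_previous_load(Path("full-dataset"))` before a reload would leave an empty directory regardless of which version wrote the old files, which would cover both the pre-2.2.0 case and the >= 2.2.0 case mentioned above.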