I've noticed that some IGs publish reports as .doc files, and it would be good to extract text from them. This could use AbiWord's command line interface, or unoconv, which drives LibreOffice. As a bonus, unoconv could also convert the report to PDF, if we'd like to standardize on that.
Edit: It looks like we can also extract metadata, including the last modified time, using the venerable file command.
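A rough sketch of how the three tools might be invoked from Python. The helper names (abiword_text_cmd, unoconv_cmd, file_cmd, run_if_available) and the report.doc path are made up for illustration, and it assumes abiword, unoconv and file are installed and on the PATH. This is not a tested integration, just the shape of it:

```python
import shutil
import subprocess
from pathlib import Path


def abiword_text_cmd(doc, out):
    """Command for AbiWord to convert `doc` to plain text at `out`."""
    return ["abiword", "--to=txt", f"--to-name={out}", str(doc)]


def unoconv_cmd(doc, fmt):
    """Command for unoconv to convert `doc` to `fmt` (e.g. "txt" or "pdf").

    By default unoconv writes the result next to the input file.
    """
    return ["unoconv", "-f", fmt, str(doc)]


def file_cmd(doc):
    """Command for `file` to describe `doc`.

    For OLE .doc files the description typically includes metadata
    such as the last saved time; the exact fields depend on the file.
    """
    return ["file", "--brief", str(doc)]


def run_if_available(cmd):
    """Run `cmd` and return its stdout, or None if the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    doc = Path("report.doc")  # hypothetical report path
    if doc.exists():
        run_if_available(abiword_text_cmd(doc, doc.with_suffix(".txt")))
        run_if_available(unoconv_cmd(doc, "pdf"))
        print(run_if_available(file_cmd(doc)))
```

Building the commands separately from running them keeps it easy to swap AbiWord for unoconv (or the reverse) once we know which one handles these reports better.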