ArchitecturalKnowledgeAnalysis / EmailIndexer

Utility for generating Lucene indexes for collections of emails.
MIT License

Implemented Filter Exporter #8

Closed by wmeijer221 2 years ago

wmeijer221 commented 2 years ago

Added a FilterExporter class that can export data using a SearchFilter instead of a search query. To do this, the structure of the code had to be changed.

Originally, all concrete exporters (PdfExporter etc.) inherited from QueryExporter, which in turn inherited from EmailDatasetExporter. In practice, the concrete exporters implemented none of the EmailDatasetExporter methods, so they are now completely decoupled from that class. Instead, they share a common interface, TypeExporter, which declares all of the template methods (beforeExport etc.).

I introduced a new abstract class, SampleExporter, which inherits from EmailDatasetExporter and abstracts away the asynchronous logic shared by the concrete sample exporters (i.e. QueryExporter and FilterExporter). This abstraction is not essential, but it avoids duplicating that code in both exporters.

In the new architecture, code that uses an exporter has to create a TypeExporter and the ExporterParameters and pass them to the constructor of the concrete SampleExporter. Beyond that, exporting works the same as it did before. I removed the QueryExporterParameters class, as it was essentially a simplified derivative of ExporterParameters.
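To make the structure above easier to picture, here is a rough, self-contained sketch. It is not the code from this PR: only the class names and beforeExport come from the description. The other method names (exportEmail, afterExport, findEmails), the parameter fields, the use of plain strings as emails, and Predicate<String> as a stand-in for SearchFilter are all assumptions for illustration. The EmailDatasetExporter parent is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Predicate;

// Hypothetical stand-in for ExporterParameters; the real fields may differ.
final class ExporterParameters {
    final String outputName;
    final int maxResultCount;

    ExporterParameters(String outputName, int maxResultCount) {
        this.outputName = outputName;
        this.maxResultCount = maxResultCount;
    }
}

// Common interface of the format exporters (PdfExporter etc.).
// Only beforeExport is named in the description; the rest is assumed.
interface TypeExporter {
    void beforeExport(ExporterParameters params);
    void exportEmail(String email);
    void afterExport();
}

// Owns the asynchronous flow shared by QueryExporter and FilterExporter.
abstract class SampleExporter {
    private final TypeExporter typeExporter;
    private final ExporterParameters params;

    protected SampleExporter(TypeExporter typeExporter, ExporterParameters params) {
        this.typeExporter = typeExporter;
        this.params = params;
    }

    // Subclasses decide how emails are selected (query or filter).
    protected abstract List<String> findEmails(int limit);

    public CompletableFuture<Void> export() {
        return CompletableFuture.runAsync(() -> {
            typeExporter.beforeExport(params);
            for (String email : findEmails(params.maxResultCount)) {
                typeExporter.exportEmail(email);
            }
            typeExporter.afterExport();
        });
    }
}

// Selects emails with a filter; Predicate<String> stands in for SearchFilter.
final class FilterExporter extends SampleExporter {
    private final List<String> emails;
    private final Predicate<String> filter;

    FilterExporter(TypeExporter typeExporter, ExporterParameters params,
                   List<String> emails, Predicate<String> filter) {
        super(typeExporter, params);
        this.emails = emails;
        this.filter = filter;
    }

    @Override
    protected List<String> findEmails(int limit) {
        List<String> matches = new ArrayList<>();
        for (String email : emails) {
            if (matches.size() == limit) break;
            if (filter.test(email)) matches.add(email);
        }
        return matches;
    }
}

// Toy TypeExporter that records the calls it receives.
final class RecordingExporter implements TypeExporter {
    final List<String> events = new ArrayList<>();

    @Override public void beforeExport(ExporterParameters params) { events.add("begin " + params.outputName); }
    @Override public void exportEmail(String email) { events.add(email); }
    @Override public void afterExport() { events.add("end"); }
}
```

A caller would build the TypeExporter and the parameters, hand both to the concrete exporter, and wait on the returned future, for example `new FilterExporter(new RecordingExporter(), new ExporterParameters("out", 10), emails, e -> e.contains("lucene")).export().join()`.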

Note that, to export, I use the EmailSearcher class rather than the EmailIndexSearcher, as there is no corresponding search implementation for search filters. Considering that only a limited number of emails will be exported at any given time, I didn't see any need to implement this functionality. The solution does load emails using a fixed page size, in an attempt to limit memory use (assuming this does indeed help).
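The paging idea, sketched in isolation: fetch one fixed-size page at a time and hand each email to the exporter, so that at most one page is held in memory. This is an illustration only. PageSource, PagedLoader, forEachEmail and the page size of 2 are made-up names and values, not the EmailSearcher API.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical page-based source; the real EmailSearcher API may look different.
interface PageSource {
    // Returns the emails on the given zero-based page, or an empty list past the end.
    List<String> loadPage(int page, int size);
}

final class PagedLoader {
    // Fixed page size; the value is arbitrary for this sketch.
    static final int PAGE_SIZE = 2;

    // Feeds up to `limit` emails to `sink`, one page at a time.
    // Returns the number of emails that were passed on.
    static int forEachEmail(PageSource source, int limit, Consumer<String> sink) {
        int exported = 0;
        for (int page = 0; exported < limit; page++) {
            List<String> batch = source.loadPage(page, PAGE_SIZE);
            if (batch.isEmpty()) break; // no more results
            for (String email : batch) {
                if (exported == limit) break;
                sink.accept(email);
                exported++;
            }
        }
        return exported;
    }
}
```

Whether this actually lowers peak memory depends on the source releasing each page once it has been consumed.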

I hope you like my changes! :D

andrewlalis commented 2 years ago

Thanks for the improvements! I do like the fact that you were able to unify exporting for query and non-query searches.