Open NicolaDonelli opened 3 years ago
Being the "archive" a side-effect probably makes also sense to return None (and this would also give the same output between archiveOne and archiveMany, which should always be None). Another possibility (but I would do it later in the future when we integrate Mondas nicely in the package) would be to return an Either, in order to also handle errors functionally.
about the arguments of retrieve it is actually quite a pain. Unfortunately we don't have an abstraction for "Query" yet, that is an object that represents the filters one wants to apply when querying persistence layers that is agnostic from the actual implementation (SQL wants strings, Pandas we could use functions, MongoDB we use dicts). I agree, this is really bothering me, but unfortunately, I don't quite have an elegant solution for handling this.
What we started to do in CIB News is to extend from the Archiver, implement methods that do the querying and call the retrieve with the generic output, e.g.
class DocumentsArchiver(Archiver):
@abstractmethod
def byPublishDate(publishDate: str): ...
class MongoDocumentsArchiver(MongoArchiver, DocumentsArchiver):
def byPublishDate(publishDate: str):
return self.retrieve({"publishDate": publishDate})
class PandasDocumentsArchiver(PandasArchiver, DocumentsArchiver):
def byPublishDate(publishDate: str):
return self.retrieve(lambda df: df[df["publishDate"] == publishDate])
In the code, we never use retrieve directly, but we generally use the custom retrieving methods, such as byPublishDate
But maybe, here, @nromeoarena can advise what it is best to do
In my opinion, we should avoid function returning Node. For functions like archive, I do agree with @deusebio that Either could be the best solution. It will at least return the status of the operation in order to track errors.
As an input object for the archive, I also agree with @deusebio and we should think more about a "Query" abstract object that will implement query in a specific language like QueryMongo, QueryPandas, QuerySQL. We can then create a factory that actually builds specific queries instead of putting the logic of a query inside the Archiver. As example
class ClientQueryBuilder(MongoQueryBuilder): def retrieveByDate(data :str) -> MongoQuery: def retrieveClientByRegion(region: str) -> MongoQuery:
the archiver will then contain only the archive method def archive(query: Query) -> Either
and def retrieve(query: Query) -> Either
, according to this refactoring the archiver will only contain the "connection to the database" logic.
Moreover if we centralize the querybulder/queryfactory we have a centralized point where we actually build the query and we do not need to write the same query in different part of the code without touching the archiver that will be finally independent form business logic.
There are a few possible issues with cgnal.data.layer.mongo.archivers.MongoArchiver class.
@deusebio @memoclaudio @aiola