Something I noticed with the API is that you require, by convention, that someone call download/parse/nlp before accessing properties. This is a lazy evaluation problem, because the the common approach is to have the object ready to use when instantiated. Instead, we don't want to hammer the server each time an object is created if the data the property depends on isn't available.
Instead of the approach by enforcing convention and throwing errors, a more user friendly approach is to perform the operations as needed.
One library that helps with this is lazy, but this doesn't solve the whole problem.
You need to do something like this. See pseudocode below:
class Article(object):
def __init__(self, url):
self.url = url
self._raw_article = None
self._parsed_article = None
@property
def raw_article(self):
if self._raw_article is None:
self.download()
return self._raw_article
@property
def parsed_article(self):
if self._parsed_article is None:
self.parse()
return self._parsed_article
@property
def title(self):
# the check for not None here will inherently lazy load the requirement
# can check for other requirements as well here
if self._parsed_article is not None:
return self._parsed_article.title
Alternatively, something like lazy makes this more simple:
Something I noticed with the API is that you require, by convention, that someone call download/parse/nlp before accessing properties. This is a lazy evaluation problem, because the the common approach is to have the object ready to use when instantiated. Instead, we don't want to hammer the server each time an object is created if the data the property depends on isn't available.
Instead of the approach by enforcing convention and throwing errors, a more user friendly approach is to perform the operations as needed.
One library that helps with this is lazy, but this doesn't solve the whole problem.
You need to do something like this. See pseudocode below:
Alternatively, something like lazy makes this more simple: