The goal is to avoid re-fetching a post that is already in Mongo. However, even when a post is in Mongo, its HTML may or may not have been fetched yet.
Hence, the post has three states:
1) Not in Mongo.
2) In Mongo, but the HTML has not been fetched. In this case the content field of the Post object contains the "summary".
3) In Mongo, with the HTML saved. In this case the content field of the Post object contains the HTML.
I therefore suggest adding a state field to the Post object, which takes one of the following values:
CREATED / WRANGLED / FETCHED
Also add two methods:
is_in_mongo() and was_fetched()
And the following behavior:
! is_in_mongo() -> wrangle the post and fetch the HTML.
is_in_mongo() && ! was_fetched() -> fetch the HTML and set the content to the HTML.
is_in_mongo() && was_fetched() -> get the post from Mongo and return it.
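
A minimal Python sketch of the proposed behavior, so we can discuss it concretely. Everything here beyond the names CREATED / WRANGLED / FETCHED, is_in_mongo() and was_fetched() is an assumption: PostStore is a hypothetical in-memory stand-in for the Mongo collection, posts are keyed by URL, wrangle and fetch_html are placeholder callables, and I am assuming WRANGLED means "in Mongo with only the summary" while CREATED means "not stored yet".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional


class PostState(Enum):
    CREATED = "created"    # assumed: object exists but is not stored yet
    WRANGLED = "wrangled"  # assumed: stored, content holds the summary
    FETCHED = "fetched"    # stored, content holds the HTML


@dataclass
class Post:
    url: str
    content: str = ""
    state: PostState = PostState.CREATED


class PostStore:
    """Hypothetical stand-in for the Mongo collection, keyed by URL."""

    def __init__(self) -> None:
        self._docs: Dict[str, Post] = {}

    def get(self, url: str) -> Optional[Post]:
        return self._docs.get(url)

    def save(self, post: Post) -> None:
        self._docs[post.url] = post


def is_in_mongo(store: PostStore, url: str) -> bool:
    return store.get(url) is not None


def was_fetched(store: PostStore, url: str) -> bool:
    post = store.get(url)
    return post is not None and post.state is PostState.FETCHED


def get_post(
    store: PostStore,
    url: str,
    wrangle: Callable[[str], Post],
    fetch_html: Callable[[str], str],
) -> Post:
    # Case 1: not in the store -> wrangle first (content is the summary).
    if not is_in_mongo(store, url):
        post = wrangle(url)
        post.state = PostState.WRANGLED
        store.save(post)

    post = store.get(url)

    # Cases 1 and 2: stored but HTML missing -> fetch and overwrite content.
    if not was_fetched(store, url):
        post.content = fetch_html(url)
        post.state = PostState.FETCHED
        store.save(post)

    # Case 3: already fetched -> returned as stored, no network call.
    return post
```

With this shape, calling get_post twice for the same URL should trigger fetch_html only once, which is the duplicate fetch we want to avoid.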