**Closed** — varenc closed this 9 months ago
That's interesting. I removed circular references quite a few versions ago. There is an adjacent reference to `Items` both in the `Order` and in `Order.shipments`, though that'd be a subset (assuming multiple shipments) and should still be picklable. However, I also just tried `json.dumps(orders)`, and that fails too, saying `Order` is not serializable, so I probably need to look into that a bit more. It used to work earlier in development, so I must've added something that broke serialization. At present I can't think of anything that wouldn't be serializable on these objects, but they're fairly well nested, so it might require some sleuthing.
I don't know when I'll get a chance to look at this, but here are the objects and their associated types—this and anything in the encompassing `entity` package. Let me know if anything jumps out at you as an obvious issue for serializing these objects.
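For anyone poking at this in the meantime: the `json.dumps` failure on its own is expected, since `json` only serializes built-in types and needs a `default` hook for arbitrary objects. A minimal illustration (the `Order` class here is a hypothetical stand-in, not the library's actual entity):

```python
import json

class Order:
    """Hypothetical stand-in for the library's Order entity."""
    def __init__(self, order_id, total):
        self.order_id = order_id
        self.total = total

order = Order("123-456", 19.99)

# json.dumps raises TypeError for arbitrary objects...
try:
    json.dumps(order)
except TypeError as exc:
    print(exc)  # Object of type Order is not JSON serializable

# ...but a `default` hook that falls back to __dict__ works for flat objects
print(json.dumps(order, default=lambda o: o.__dict__))
# {"order_id": "123-456", "total": 19.99}
```

This only goes one level deep cleanly, of course; deeply nested entities holding non-serializable members (like the `Tag` issue below) would still need per-class handling.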
Quick update. From my research, the issue appears to be with BeautifulSoup; apparently its `Tag` objects are not picklable. Every `Parsable` object contains a representation of its unparsed self in `parsed`, which is a `Tag` object.

A quick and dirty test adding this to `Parsable` made pickling succeed for me:
```python
def __getstate__(self):
    delattr(self, "parsed")
    return self.__dict__
```
Care to try this on your fork? I don't know that this is the direction we actually want to go in, but it will confirm that the `parsed` field on the base entity is the issue.
Alright, the above modifies the actual object when you pickle it, so here's what I went with instead:
```python
def __getstate__(self):
    state = self.__dict__.copy()
    state.pop("parsed")
    return state
```
This is now deployed in `v1.0.13`, as I'd consider this a bug and wanted to get it patched.
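As a self-contained sanity check of why this works, here's a minimal sketch. `Parsable` below is a stand-in, with a `threading.Lock` playing the role of the unpicklable BeautifulSoup `Tag` (the structure is assumed for illustration, not copied from the library):

```python
import pickle
import threading

class Parsable:
    """Hypothetical stand-in for the library's base entity."""
    def __init__(self, data):
        self.data = data
        # A Lock is unpicklable, like a BeautifulSoup Tag
        self.parsed = threading.Lock()

    def __getstate__(self):
        # Copy the dict so pickling doesn't mutate the live object
        state = self.__dict__.copy()
        state.pop("parsed")
        return state

obj = Parsable({"order_id": "123-456"})
blob = pickle.dumps(obj)              # succeeds: "parsed" is excluded
restored = pickle.loads(blob)
print(restored.data)                  # {'order_id': '123-456'}
print(hasattr(restored, "parsed"))    # False: attribute dropped on round-trip
print(hasattr(obj, "parsed"))         # True: original object untouched
```

Note the unpickled object simply lacks `parsed`; if code reads that attribute after a round-trip, a matching `__setstate__` that re-parses (or sets it to `None`) might be worth considering.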
I don't know that caching support is something we'll add natively, so I'll close this ticket. If the issue persists with the latest version, feel free to comment with more details and we can reopen if needed. Thanks!
Describe the Feature
I'd like to be able to easily cache/memoize my Amazon orders. This makes sense to me since every order from any year besides the current one will essentially never change. At the moment I'm working on a personal script and am constantly re-fetching them from Amazon. I worry that doing this too much might trigger some hacking/abuse detection stuff from Amazon and they'll lock my account. Also it's just a bit slow.
I'd love it if there was a built-in cache. Of course, full caching support could be a big feature, so a simpler request would just be to make an `Order` object serializable so I can cache them myself. If I try to do `pickle.dump(some_order, file_handler)` I get a `RecursionError: maximum recursion depth exceeded` error. I assume this is because Orders contain some sort of circular reference, though I haven't investigated why.

Describe Alternative Solutions/Workarounds
At the moment I don't have a clear workaround. If I were to approach this myself, I'd probably just dig into the code and figure out how to use some sort of cache on all HTTP requests. There are caching wrappers for `requests` that you can configure to cache everything (ignoring normal HTTP caching rules), so that'd be one alternative approach, though making this a first-class feature is more work.
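For the "cache them myself" route, once orders pickle cleanly, a tiny do-it-yourself disk cache covers the "older years never change" case. A sketch with stdlib only (all names here are hypothetical, not part of the library):

```python
import pickle
from pathlib import Path

def fetch_cached(fetch, cache_path):
    """Return the pickled result from cache_path if it exists;
    otherwise call fetch(), pickle the result to disk, and return it."""
    path = Path(cache_path)
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)
    result = fetch()
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result
```

Usage would look something like `orders_2022 = fetch_cached(lambda: get_order_history(year=2022), "orders_2022.pkl")`, where `get_order_history` stands in for whatever fetch call you use; each closed year gets its own file and is only fetched from Amazon once.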