Open bridwell opened 5 years ago
Hi @bridwell, this is a good question.
My initial reaction is that it would be nice to finally implement a dependency graph so we can automatically invalidate the cache in these cases. (Your question prompted me to look back through some earlier discussions about this -- linking https://github.com/UDST/orca/issues/15 for future reference.)
Since the templates are a lot stricter than free-form Orca steps, we should have enough information to build a full dependency graph for ad-hoc columns now -- we're tracking exactly what the inputs are whenever we use an @orca.column() decorator under the hood, like in column_from_expression.py.
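To make the dependency-graph idea concrete, here is a rough sketch in plain Python. It does not use Orca or the templates library: the `CacheGraph` class, its methods, and the column names are all hypothetical, and the cache is just a dict. It only illustrates how recording each column's inputs at registration time would let us invalidate everything downstream of a change.

```python
from collections import defaultdict, deque


class CacheGraph:
    """Hypothetical stand-in for a cache that knows each item's inputs."""

    def __init__(self):
        self._dependents = defaultdict(set)  # input name -> names computed from it
        self._cache = {}                     # name -> cached value

    def register(self, name, inputs):
        # Record that `name` is computed from each item in `inputs`.
        for inp in inputs:
            self._dependents[inp].add(name)

    def store(self, name, value):
        self._cache[name] = value

    def is_cached(self, name):
        return name in self._cache

    def invalidate(self, name):
        # Breadth-first walk over everything downstream of `name`,
        # dropping cached values along the way.
        seen = {name}
        queue = deque([name])
        while queue:
            current = queue.popleft()
            self._cache.pop(current, None)
            for dep in self._dependents.get(current, ()):
                if dep not in seen:
                    seen.add(dep)
                    queue.append(dep)
        return seen


# Illustrative names only.
graph = CacheGraph()
graph.register("buildings.vacant_units",
               ["households.building_id", "buildings.residential_units"])
graph.register("zones.vacancy_rate", ["buildings.vacant_units"])

graph.store("buildings.residential_units", [10, 20])
graph.store("buildings.vacant_units", [2, 5])
graph.store("zones.vacancy_rate", [0.2])

# A location choice step rewrites households.building_id:
cleared = graph.invalidate("households.building_id")
```

In this toy setup, invalidating `households.building_id` also drops `buildings.vacant_units` and `zones.vacancy_rate`, while `buildings.residential_units` stays cached because nothing upstream of it changed.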
And just to note, the templates do support specifying the cache_scope whenever a table or column is generated, with the same defaults as Orca: output_column.py#L30.
But you might be right that accepting a list of items whose cache should be cleared at the end of a step is the most expedient solution here.
In our system, we often have to clear orca caches manually at the end of a given model step. For example, after running household location choice, we need to update the occupancy characteristics of buildings for later steps. Would it make sense for all templates to optionally accept a list of orca injectables/tables/columns that should be cleared at the end of the step?
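For illustration, the "list of items to clear" option might look something like the sketch below. This is not an existing template parameter: `clear_after_step`, the shape of the `targets` dict, and the `clearers` mapping are all assumptions. The clearing functions are passed in so the sketch runs without Orca; in a real step they would presumably wrap Orca's own cache-clearing calls.

```python
VALID_KINDS = ("injectables", "tables", "columns")


def clear_after_step(targets, clearers):
    """Clear each named item and return what was cleared, in order.

    targets:  hypothetical template setting, e.g.
              {"columns": ["buildings.vacant_units"], "tables": ["buildings"]}
    clearers: maps each kind to a function that clears one item by name.
              (Assumption: with Orca these would wrap its cache-clearing
              methods for injectables, tables, and columns.)
    """
    unknown = set(targets) - set(VALID_KINDS)
    if unknown:
        raise ValueError("unknown cache target kinds: %s" % sorted(unknown))

    cleared = []
    for kind in VALID_KINDS:
        for name in targets.get(kind, []):
            clearers[kind](name)
            cleared.append((kind, name))
    return cleared


# Stand-in cache so the example is self-contained.
fake_cache = {
    "injectables": {"year"},
    "tables": {"buildings"},
    "columns": {"buildings.vacant_units", "buildings.occupancy"},
}
clearers = {kind: fake_cache[kind].discard for kind in VALID_KINDS}

# E.g. at the end of a household location choice step:
cleared = clear_after_step(
    {"columns": ["buildings.vacant_units"], "tables": ["buildings"]},
    clearers,
)
```

Grouping the targets by kind avoids having to guess whether a bare name refers to a table or an injectable; a flat list of names would be shorter to write but ambiguous.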