Open dreadatour opened 2 weeks ago
How about materialise instead of persist? Just a suggestion.
`.persist()` is the name of the method in the dataframe API standard. I think that's what we should use, assuming it works exactly as described in the standard.
Follow-up to https://github.com/iterative/datachain/issues/327
Sometimes it is useful to save intermediate chain state: because operations are lazy, chains are not executed immediately and intermediate results are not stored.
For example, if we want to create `dc_filtered_1` and `dc_embeddings` from `dc`, then without saving the intermediate state, the `dc` chain will be executed twice, once for each child.

It is possible to do this with the `save()` method without the `name` param, and we also have the `exec()` method, but `persist()` looks like a better and more descriptive name for this method.

After the `persist()` method is implemented, we may want to make the `name` param in the `save()` method mandatory.
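
To illustrate the double execution described above, here is a minimal sketch using a toy lazy chain. `LazyChain` and `count_source_runs` are hypothetical names invented for this example. This is not DataChain's implementation, and the real `persist()` may behave differently (for instance, it would likely store results in a dataset rather than in memory).

```python
from typing import Any, Callable, Iterable, List


class LazyChain:
    """Toy stand-in for a lazy chain (hypothetical, not DataChain's API)."""

    def __init__(self, source: Callable[[], Iterable[Any]]):
        # The source callable is invoked again on every execution.
        self._source = source

    def map(self, fn: Callable[[Any], Any]) -> "LazyChain":
        # Nothing runs here; we only wrap the parent source.
        return LazyChain(lambda: (fn(x) for x in self._source()))

    def filter(self, pred: Callable[[Any], bool]) -> "LazyChain":
        return LazyChain(lambda: (x for x in self._source() if pred(x)))

    def persist(self) -> "LazyChain":
        # Execute the chain once and keep the rows, so that child
        # chains read the stored rows instead of re-running the parent.
        rows = list(self._source())
        return LazyChain(lambda: rows)

    def collect(self) -> List[Any]:
        # Trigger execution and return all rows.
        return list(self._source())


def count_source_runs(use_persist: bool) -> int:
    """Return how many times the expensive source is executed."""
    runs = 0

    def expensive_source() -> Iterable[int]:
        nonlocal runs
        runs += 1
        return range(10)

    dc = LazyChain(expensive_source).map(lambda x: x * 2)
    if use_persist:
        dc = dc.persist()

    # Two children derived from the same parent chain.
    dc_filtered_1 = dc.filter(lambda x: x > 10).collect()
    dc_embeddings = dc.map(lambda x: x + 1).collect()
    return runs


print(count_source_runs(use_persist=False))  # 2
print(count_source_runs(use_persist=True))   # 1
```

Without `persist()`, each child re-runs the parent chain from its source. With it, the parent runs once and both children reuse the stored rows.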