man-group / notebooker

Productionise & schedule your Jupyter Notebooks as easily as you wrote them.
GNU Affero General Public License v3.0

If a notebook file is renamed or moved around it loses history #159

Open marcinapostoluk opened 9 months ago

marcinapostoluk commented 9 months ago

After a rename or move, the old results are shown separately from the new results. It might be good to provide a Move feature that re-links the old results to the renamed report.

marcinapostoluk commented 9 months ago

So far I have moved my notebooks into folders semi-manually by running:

CAREFUL! This needs to run on the same version of apscheduler as the scheduler itself; otherwise the pickling produces incorrect data and the scheduler will crash on startup.
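One way to make that warning harder to miss is to check the installed version before touching any data. This is only a sketch: `require_version` is a hypothetical helper, and the expected version is something you would have to read off the environment the scheduler actually runs in.

```python
from importlib import metadata


def require_version(package: str, expected: str) -> None:
    """Raise RuntimeError unless `package` is installed at exactly `expected`."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        raise RuntimeError(f"{package} is not installed") from None
    if installed != expected:
        raise RuntimeError(
            f"{package} {installed} is installed, but {expected} is required"
        )


# Hypothetical usage; the version string is a placeholder to fill in:
# require_version("apscheduler", "[apscheduler version used by the scheduler]")
```

Calling it at the top of the migration script stops the run before any job state is unpickled or rewritten.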

    from ahl.mongo import PyMongoose
    from ahl.mongo.auth import authenticate
    from mkd.auth.mongo import get_auth

    mongo = PyMongoose("[HOST]")._conn["[DB]"]
    authenticate(mongo, "[USER]", "[PASS]")

    from_path = "[old report path]"
    to_path = "[new report path]"

    # modify paths
    modified = []
    lib = mongo["[lib]"]

    for c in lib.find():
        if c["report_name"] == from_path:
            c["report_name"] = to_path
            modified.append(c)
    for m in modified:
        lib.replace_one({"_id": m["_id"]}, m)
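The read-modify-replace loop over results could probably be collapsed into a single server-side update. A minimal sketch, assuming `results` is a PyMongo collection (or anything exposing `update_many`) whose documents carry a `report_name` field as in the snippet above; `relink_results` is a hypothetical helper name, not part of Notebooker.

```python
def relink_results(results, from_path, to_path):
    """Point every stored result for `from_path` at `to_path`.

    `results` is assumed to behave like a pymongo Collection.
    Returns the number of documents the server reports as modified.
    """
    outcome = results.update_many(
        {"report_name": from_path},            # only results of the old report
        {"$set": {"report_name": to_path}},    # leave every other field untouched
    )
    return outcome.modified_count
```

Because only `report_name` is set, the other fields of each result document are not rewritten, and the returned count gives a quick sanity check on how many results were re-linked.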

    # modify schedules
    import pickle

    modified = []
    scheduler_lib = mongo["[lib]_scheduler"]
    for job in scheduler_lib.find():
        state = pickle.loads(job["job_state"])
        if state["kwargs"]["report_name"] == from_path:
            state["kwargs"]["report_name"] = to_path
            print("replaced")
            job["job_state"] = pickle.dumps(state)
            modified.append(job)
    # write back to the same collection the jobs were read from
    for m in modified:
        scheduler_lib.replace_one({"_id": m["_id"]}, m)
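The pickle handling in the schedule step can be pulled out into a pure function, which makes it possible to try it on a copy of a job document before writing anything back. This is a sketch under assumptions: `rename_report_in_job_state` is a hypothetical name, and the job state is taken to be a pickled dict with a `kwargs` entry, as the snippet above implies. Unpickling a real job state will most likely still need apscheduler importable, in the version the scheduler uses.

```python
import pickle
from typing import Optional


def rename_report_in_job_state(
    blob: bytes, from_path: str, to_path: str
) -> Optional[bytes]:
    """Return a re-pickled job state with report_name changed, or None.

    None means the job does not belong to `from_path` and needs no update.
    """
    state = pickle.loads(blob)
    kwargs = state.get("kwargs", {})
    if kwargs.get("report_name") != from_path:
        return None
    kwargs["report_name"] = to_path
    return pickle.dumps(state)
```

A caller would replace `job_state` only when the function returns something other than `None`, so jobs belonging to other reports are never rewritten.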

marcinapostoluk commented 5 months ago

This is also needed to correct the schedules fully:

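    # `lib` here is presumably the scheduler collection, since its documents carry job_state
    found = lib.find_one({'_id': old_path})

    job_state = pickle.loads(found['job_state'])
    job_state['kwargs']['scheduler_job_id'] = new_path
    job_state['id'] = new_path
    found['job_state'] = pickle.dumps(job_state)
    found['_id'] = new_path
    # Collection.save() was removed in PyMongo 4; an upserting replace_one does the same job
    lib.replace_one({'_id': new_path}, found, upsert=True)
    lib.delete_one({'_id': old_path})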