Open chooliu opened 1 week ago
Hi, thank you for your interest. That's a good question. I overlooked implementing a loading function. For now, I think the most straightforward approach is to use pickle to dump and load trained Higashi_model instances.
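A minimal sketch of what such a dump/load round trip could look like with the standard library. The helper names save_checkpoint and load_checkpoint are hypothetical and not part of Higashi; whether a trained Higashi object survives pickling intact is not guaranteed by this sketch.

```python
import pickle
from pathlib import Path


def save_checkpoint(obj, path):
    """Serialize obj to path with pickle, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_checkpoint(path):
    """Load an object previously written by save_checkpoint."""
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

In a first job one would call save_checkpoint(model, "higashi_output/higashi_model.pickle"), and in a later job load_checkpoint on the same path. Note that unpickling requires the same class definitions (here, the same installed Higashi version) to be importable in the new session.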
Thanks for the thoughts, Ruochi! I got the chance to try this today, but I think my naive attempt at pickling results in an error.
The following code runs without any clear errors and begins the imputation via higashi_model.train_for_imputation_nbr_0()
if I run it all in one session. (It just times out on that step due to my computing cluster constraints.)
import pickle  # imported explicitly rather than relying on the star imports below
from higashi.Higashi_wrapper import *
from fasthigashi.FastHigashi_Wrapper import *
config = "higashi_config/config.json"
higashi_model = Higashi(config)
higashi_model.process_data()
fh_model = FastHigashi(config_path = config,
                       path2input_cache = "higashi_cache",
                       path2result_dir = "higashi_output",
                       off_diag = 100, filter = False, do_conv = False,
                       do_rwr = False, do_col = False, no_col = False)
fh_model.prep_dataset()
fh_model.run_model(dim1 = 0.6, rank = 256, n_iter_parafac = 1, extra = "")
higashi_model.prep_model()
higashi_model.train_for_embeddings()
# added pickle dump section --------------------------------------
with open("higashi_output/fh_model.pickle", "wb") as f:
    pickle.dump(fh_model, f)
with open("higashi_output/higashi_model.pickle", "wb") as f:
    pickle.dump(higashi_model, f)
# ------------------------------------------------------------
higashi_model.train_for_imputation_nbr_0()
higashi_model.impute_no_nbr()
higashi_model.train_for_imputation_with_nbr()
higashi_model.impute_with_nbr()
However, splitting it into two jobs and loading the pickled higashi_model in the second job
results in the following error.
with open("higashi_output/higashi_model.pickle", "rb") as f:
    higashi_model = pickle.load(f)
higashi_model.train_for_imputation_nbr_0()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "higashi-0.1.0a0-py3.10.egg/higashi/Higashi_wrapper.py", line 1374, in train_for_imputation_nbr_0
    self.train_for_imputation_no_nbr()
  File "higashi-0.1.0a0-py3.10.egg/higashi/Higashi_wrapper.py", line 1379, in train_for_imputation_no_nbr
    del self.higashi_model, self.node_embedding_init
AttributeError: higashi_model
Please let me know if I'm missing something obvious! I will try to make a reproducible example on a smaller dataset (fewer cells) in the meantime.
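The AttributeError in the traceback indicates that the reloaded object has no higashi_model attribute at the point where train_for_imputation_no_nbr tries to delete it. One way to narrow this down is to compare the attribute names on the object before pickling with those present after a round trip. The sketch below uses a hypothetical StandIn class, not the real Higashi class; that an attribute gets dropped during pickling is an assumption made purely for illustration and is not confirmed for Higashi.

```python
import pickle


class StandIn:
    """Hypothetical stand-in for a model wrapper; NOT the real Higashi class."""

    def __init__(self):
        self.config = {"rank": 256}
        self.higashi_model = object()  # placeholder for a trained network

    def __getstate__(self):
        # Assumption for illustration only: this attribute is omitted when pickling.
        state = self.__dict__.copy()
        state.pop("higashi_model", None)
        return state


def missing_after_roundtrip(obj):
    """Return attribute names present on obj but absent after a pickle round trip."""
    restored = pickle.loads(pickle.dumps(obj))
    return sorted(set(vars(obj)) - set(vars(restored)))


print(missing_after_roundtrip(StandIn()))  # prints ['higashi_model']
```

Running missing_after_roundtrip on the real object just before the dump step would show whether anything is lost in serialization. An empty list would instead suggest the attribute was never set on the object at the time it was pickled.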
Hi Ruochi, thanks so much for developing Higashi & FastHigashi.
I've been trying to obtain the cell-level matrices imputed with neighbors, following the newer Fast-Higashi tutorials/Ramani et al.ipynb workflow, on a larger dataset for which it's difficult to request enough compute time on our cluster to complete the training and imputation in one go.
I notice in the Higashi API notes that the temp_dir should store intermediate outputs in case of interruption, but I have not been able to get this to resume. I wanted to ask if there are suggested commands to make sure these load properly into the right object structure, or commands to skip (especially in the Higashi+FastHigashi case), or if the intermediate results should load automatically given the same config file.
Namely, I can usually get through higashi_model.prep_model() and fh_model.run_model(), but after this, is there a proper way to load higashi_model in a new session?
Cheers!