YosefLab / Cassiopeia

A Package for Cas9-Enabled Single Cell Lineage Tracing Tree Reconstruction
https://cassiopeia-lineage.readthedocs.io/en/latest/
MIT License

ILP solver failing #213

Closed. Marius1311 closed this issue 1 year ago.

Marius1311 commented 1 year ago

Hi there, when trying to run the ILP solver on the Chan et al., Nature 2019 data (on embryo 1) using

ilp_solver = cas.solver.ILPSolver(convergence_time_limit=500, maximum_potential_graph_layer_size=10000, weighted=True, seed=1234)
ilp_solver.solve(cas_tree)

I get:

[2023-08-17 10:18:34,344]    INFO [ILPSolver] Solving tree with the following parameters.
[2023-08-17 10:18:34,344]    INFO [ILPSolver] Convergence time limit: 500
[2023-08-17 10:18:34,345]    INFO [ILPSolver] Convergence iteration limit: 0
[2023-08-17 10:18:34,345]    INFO [ILPSolver] Max potential graph layer size: 10000
[2023-08-17 10:18:34,346]    INFO [ILPSolver] Max potential graph lca distance: None
[2023-08-17 10:18:34,346]    INFO [ILPSolver] MIP gap: 0.01
[2023-08-17 10:18:34,351]    INFO [ILPSolver] Phylogenetic root: (0, 0, 0, 0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[26], line 1
----> 1 ilp_solver.solve(cas_tree)

File ~/mambaforge/envs/moslin/lib/python3.10/site-packages/ngs_tools/logging.py:62, in Logger.namespaced.<locals>.wrapper.<locals>.inner(*args, **kwargs)
     60 try:
     61     self.namespace = namespace
---> 62     return func(*args, **kwargs)
     63 finally:
     64     self.namespace = previous

File ~/Projects/Cassiopeia/cassiopeia/solver/ILPSolver.py:211, in ILPSolver.solve(self, cassiopeia_tree, layer, collapse_mutationless_edges, logfile)
    206         max_lca_distance = max(
    207             max_lca_distance, lca_distances[i] + lca_distances[j] + 1
    208         )
    210 # infer the potential graph
--> 211 potential_graph = self.infer_potential_graph(
    212     unique_character_matrix,
    213     pid,
    214     max_lca_distance,
    215     weights,
    216     cassiopeia_tree.missing_state_indicator,
    217 )
    219 # generate Steiner Tree ILP model
    220 nodes = list(potential_graph.nodes())

File ~/Projects/Cassiopeia/cassiopeia/solver/ILPSolver.py:307, in ILPSolver.infer_potential_graph(self, character_matrix, pid, lca_height, weights, missing_state_indicator)
    270 def infer_potential_graph(
    271     self,
    272     character_matrix: pd.DataFrame,
   (...)
    276     missing_state_indicator: int = -1,
    277 ) -> nx.DiGraph:
    278     """Infers a potential graph for the observed states.
    279 
    280     Using the set of samples in the character matrix for this solver,
   (...)
    303         A potential graph represented by a directed graph.
    304     """
    306     potential_graph_edges = (
--> 307         ilp_solver_utilities.infer_potential_graph_cython(
    308             character_matrix.values.astype(str),
    309             pid,
    310             lca_height,
    311             self.maximum_potential_graph_layer_size,
    312             missing_state_indicator,
    313         )
    314     )
    316     if len(potential_graph_edges) == 0:
    317         raise ILPSolverError("Potential Graph could not be found with" 
    318                             " solver parameters. Try increasing"
    319                             " `maximum_potential_graph_layer_size` or"
    320                             " using another solver.")

File ~/Projects/Cassiopeia/cassiopeia/solver/ilp_solver_utilities.pyx:21, in cassiopeia.solver.ilp_solver_utilities.infer_potential_graph_cython()

ValueError: Does not understand character buffer dtype format string ('w')

cas.__version__ gives '2.0.0'. I'm attaching the character matrix that seems to cause the problem. I installed Gurobi via conda, following https://www.gurobi.com/documentation/9.1/quickstart_mac/cs_anaconda_and_grb_conda_.html#subsubsection:Anaconda

embryo1_character_matrix.csv
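
For context, here is a minimal sketch of where the 'w' in the error message may come from. It does not use Cassiopeia at all: it applies the same `.values.astype(str)` conversion shown in the traceback to a small, made-up character matrix and prints the dtype and the buffer-protocol format string that NumPy exports for the result. The toy matrix and its labels are hypothetical, and the assumption that the compiled function rejects this buffer format is inferred only from the error text above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy character matrix; the real one is the attached CSV.
character_matrix = pd.DataFrame(
    [[0, 1, -1], [2, 0, 1]],
    index=["cell_a", "cell_b"],
    columns=["r1", "r2", "r3"],
)

# Same conversion as in the traceback (character_matrix.values.astype(str)).
as_str = character_matrix.values.astype(str)

# NumPy stores these as fixed-width unicode strings (dtype kind "U").
print(as_str.dtype, as_str.dtype.kind)

# Format string NumPy exposes through the buffer protocol for this array.
print(memoryview(as_str).format)
```

If the printed format string ends in 'w', that matches the character reported in the ValueError, which would suggest the failure happens when the unicode array is handed to the compiled extension rather than in the data itself.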

colganwi commented 1 year ago

Commit c1a224f18aee3991f24b2994db722c06993f672f should solve this issue. Please reinstall the latest version of Cassiopeia and try running this test again. If the issue persists, please run conda list in your environment and paste the results below so we can debug.
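
As a complement to `conda list`, the versions of the Python packages most likely to matter here can also be collected from inside the same interpreter that runs the solver. The sketch below uses only the standard library. The distribution names in `PACKAGES` are assumptions (in particular, a source checkout of Cassiopeia may be registered under a different name), and `report_versions` is a hypothetical helper, not part of Cassiopeia.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Return {distribution name: installed version or 'not installed'}."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


# Assumed distribution names; adjust them to match your environment.
PACKAGES = ["cassiopeia-lineage", "numpy", "Cython", "pandas", "gurobipy"]

for name, ver in report_versions(PACKAGES).items():
    print(f"{name}=={ver}")
```

This only covers Python distributions visible to the running interpreter, so it does not replace the full `conda list` output, which also includes non-Python packages.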

Marius1311 commented 1 year ago

Happy to confirm that https://github.com/YosefLab/Cassiopeia/commit/c1a224f18aee3991f24b2994db722c06993f672f does indeed seem to have fixed this! Thanks for the support, @colganwi!