Closed nikithkurella closed 7 months ago
Hi nikithkurella,
That is a great question. You have a custom step for trajectory inference, whose output needs to be reformatted for Dictys. I suggest to first start from the dynamic GRN inference tutorial at https://github.com/pinellolab/dictys/blob/master/doc/tutorials/full-skin/notebooks/main1.ipynb. Upon finishing this, you can check the trajectory data files (tmp/branch.tsv.gz, tmp/edge.tsv.gz, tmp/dist.tsv.gz) for their formats.
For your own data you can replace this notebook with the following. First you can identify two cells, namely the initial and final cells, with the smallest and largest pseudotime. You can place one trajectory node each at their respective locations. There's one edge between the two nodes, and all cells are on this edge. Finally, the distance from each cell to all nodes can be computed easily as their pseudotime difference with the initial or final cells.
Following this process allows you to generate the three trajectory data files. Then you can run https://github.com/pinellolab/dictys/blob/master/doc/tutorials/full-skin/notebooks/main2.ipynb to infer dynamic GRNs as in the tutorial.
Please follow up know if anything is unclear.
Lingfei
Hi @lingfeiwang thank you so much for your response. I followed your directions to prepare the prerequisite files for the main2 notebook, and I've run into the following error:
ValueError Traceback (most recent call last)
Cell In[14], line 19
17 traj=dictys.traj.trajectory.fromdist(edge.values,dist.values)
18 traj.to_file('[/scratch1/kurella/Dictys/yAL/data/traj_node.h5](https://ondemand.carc.usc.edu/node/a02-09.hpc.usc.edu/56123/lab/tree/Dictys/yAL/Dictys/yAL/data/traj_node.h5)')
---> 19 point=dictys.traj.point.fromdist(traj,branch.values,dist.values)
20 point.to_file('/scratch1/kurella/Dictys/yAL/data/traj_cell_rna.h5',traj=False)
File [~/.conda/envs/dictys/lib/python3.9/site-packages/dictys/traj.py:525](https://ondemand.carc.usc.edu/node/a02-09.hpc.usc.edu/56123/lab/tree/Dictys/yAL/~/.conda/envs/dictys/lib/python3.9/site-packages/dictys/traj.py#line=524), in point.fromdist(cls, traj, edges, dist)
523 edges=np.array([traj.edgedict[tuple(x)][0] for x in edges])
524 locs=dist[np.arange(len(edges)),traj.edges[edges,0]]
--> 525 return cls(traj,edges,locs,dist=dist)
File [~/.conda/envs/dictys/lib/python3.9/site-packages/dictys/traj.py:490](https://ondemand.carc.usc.edu/node/a02-09.hpc.usc.edu/56123/lab/tree/Dictys/yAL/~/.conda/envs/dictys/lib/python3.9/site-packages/dictys/traj.py#line=489), in point.__init__(self, traj, edges, locs, dist)
488 self.edges=edges
489 assert locs.shape==(self.npt,)
--> 490 locs=traj.conform_locs(locs,edges)
491 self.locs=locs
492 if dist is None:
File [~/.conda/envs/dictys/lib/python3.9/site-packages/dictys/traj.py:208](https://ondemand.carc.usc.edu/node/a02-09.hpc.usc.edu/56123/lab/tree/Dictys/yAL/~/.conda/envs/dictys/lib/python3.9/site-packages/dictys/traj.py#line=207), in trajectory.conform_locs(self, locs, edges, abs_err, rel_err)
206 t1=np.nonzero((eabs>abs_err)&(erel>rel_err))[0]
207 if len(t1)>0:
--> 208 raise ValueError(f'Absolute error {eabs[t1[0]]}>{abs_err} and relative error {erel[t1[0]]}>{rel_err} both exceed the given bounds.')
209 locs=np.clip(locs,0,self.lens[edges])
210 return locs
ValueError: Absolute error 3.7595970003856394e-07>1e-07 and relative error 2.8641798961666455e-07>1e-07 both exceed the given bounds.
I assume it has to do with an issue in my preparing of the files. I was wondering if you could point me in the right direction. Here's the code that I'm running for reference:
# Removes CPU usage limit by some jupyter versions
import os
os.environ['KMP_AFFINITY'] = ''
# Configure matplotlib to enable large animations
import matplotlib
matplotlib.rcParams['animation.embed_limit'] = 2**128
import matplotlib.pyplot as plt
# Prepare trajectory files
import pandas as pd
import dictys
# Load data
dist=pd.read_csv('/scratch1/kurella/Dictys/yAL/tmp/dist.tsv.gz',header=0,index_col=0,sep='\t')
edge=pd.read_csv('/scratch1/kurella/Dictys/yAL/tmp/edge.tsv.gz',header=None,index_col=None,sep='\t')
branch=pd.read_csv('/scratch1/kurella/Dictys/yAL/tmp/branch.tsv.gz',header=None,index_col=None,sep='\t')
# Save data
traj=dictys.traj.trajectory.fromdist(edge.values,dist.values)
traj.to_file('/scratch1/kurella/Dictys/yAL/data/traj_node.h5')
point=dictys.traj.point.fromdist(traj,branch.values,dist.values)
point.to_file('/scratch1/kurella/Dictys/yAL/data/traj_cell_rna.h5',traj=False)
Hi nikithkurella. That appears due to a strong validation check which I have just relaxed. Could you install the latest version and try again?
Thank you so much @lingfeiwang! That seems to have resolved the issue.
Hello. I was wondering how I might approach running dynamic inference if I already have the pseudotime values for my dataset. I have only two cell types and one branch.