gifford-lab / prescient

Codebase for PRESCIENT (Potential eneRgy undErlying Single Cell gradIENTs) for generative modeling of single-cell time-series.
MIT License
44 stars 7 forks

random_state.choice value error when passing arguments in simulate/perturbations #6

Closed: DanielYuhangLi closed this issue 5 months ago

DanielYuhangLi commented 1 year ago

Hi! I have been working through this tool and am running into some issues passing arguments through. I tried rerunning everything on the vignette data and am experiencing similar issues there. One of them is shown in the images below (hopefully they load properly). Basically, when I add the tp_subset argument, I get a ValueError from the random_state.choice function. I tried to trace it back (I am not too familiar with Python), and the issue seems to arise where the torch file labels for the time points are passed in. For some reason, the value is not being sent as an integer: for example, if I use tp 0, the value being read is '0'. The same thing happens when I add tp_subset for the perturbation function. I have been at it for a week and am hoping to get some input. I appreciate your time, and thanks in advance!

image
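The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not PRESCIENT's actual code: `sample_cells` and `tp_labels` are hypothetical stand-ins, and it assumes that cells are selected by comparing their time-point labels to `tp_subset` before being passed to NumPy's `RandomState.choice`. Under that assumption, a string `'0'` matches no integer label, and sampling from the resulting empty selection raises a `ValueError`.

```python
import numpy as np

# Hypothetical time-point labels for six cells (a stand-in for the labels
# stored in the torch file).
tp_labels = [0, 0, 1, 1, 2, 2]


def sample_cells(labels, tp_subset, num_cells, seed=0):
    """Sample indices of cells whose time-point label equals tp_subset."""
    random_state = np.random.RandomState(seed)
    # Plain Python equality: the string '0' never equals the integer 0.
    idx = [i for i, tp in enumerate(labels) if tp == tp_subset]
    return random_state.choice(idx, size=num_cells)


# An integer argument matches the labels, so sampling works.
picked = sample_cells(tp_labels, 0, num_cells=4)

# A string argument (what a command line delivers unless it is converted)
# matches nothing, so choice() is asked to sample from an empty population.
try:
    sample_cells(tp_labels, "0", num_cells=4)
    error = None
except ValueError as exc:
    error = exc

print(picked, type(error).__name__)
```

If this is what is happening, converting the argument to `int` before the comparison would avoid the empty selection.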

DanielYuhangLi commented 1 year ago

When I hard code 0 or any other value into tp_subset, the code will run again. However, I am not sure if the actual output is as expected.

For my own data, I am not sure how to start asking my question. I encounter an error in a similar area of the code, and it again looks like a data input issue (see image below). Perhaps I am not preparing my input file properly. It is a top variable feature matrix, transposed like the data file, with the metadata recoded so that timepoints are sequential. I have tried both objects and integers for the assigned cluster, to no avail. Of note, I am running things separately on a CPU-only machine, with some small tweaks to the code.

image

DanielYuhangLi commented 1 year ago

Just an update: changing the input in the simulate trajectories script to accept tp as an integer seemed to resolve this formatting issue. The number of PCs in the perturbation analysis is also hard-coded as 30, which can cause mismatched matrices in downstream analysis.
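One way to express both changes at the argument-parsing level is sketched below. This is an illustration only, not the PRESCIENT command-line interface: `build_parser` is hypothetical, `--tp_subset` mirrors the argument name used in this thread, and `--num_pcs` is an assumed flag name for the value that is currently fixed at 30.

```python
import argparse


def build_parser():
    """Illustrative parser; the flag names are assumptions, not PRESCIENT's CLI."""
    parser = argparse.ArgumentParser(description="illustrative example")
    # type=int converts the command-line string '0' to the integer 0.
    # default=None is left untouched, so "no time-point filter" stays representable.
    parser.add_argument("--tp_subset", type=int, default=None)
    # Expose the number of PCs instead of fixing it at 30 in the code.
    parser.add_argument("--num_pcs", type=int, default=30)
    return parser


args = build_parser().parse_args(["--tp_subset", "0", "--num_pcs", "50"])
defaults = build_parser().parse_args([])

print(args.tp_subset, args.num_pcs)
print(defaults.tp_subset, defaults.num_pcs)
```

Keeping 30 as the default preserves the current behavior for anyone who does not pass the flag.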

sachitsaksena commented 1 year ago

Major apologies for missing this issue request - somehow we did not get a notification. Thank you for identifying and subsequently finding a solution to your issue! We will try to incorporate the type change and make PCs a variable when time permits.

DanielYuhangLi commented 1 year ago

Thanks so much! Actually, I've noticed that when the celltype is a string, I can't enter it alongside an integer for the time point. I then have to make some changes manually, and my workaround has been to make my celltypes integers as well. Similarly, if I don't enter anything for those arguments, my hard-coding of the input as an integer breaks things... my beginner coding skills limit me here. It's difficult for me to tell whether adding these arguments alters my initializing cells.
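A filter that tolerates both cases (a string celltype next to an integer time point, and arguments left out entirely) could look roughly like the sketch below. `select_cells` is a hypothetical helper written for this thread, not a function from PRESCIENT, and the cell type names are made up for the example.

```python
def select_cells(tp_labels, celltype_labels, tp_subset=None, celltype_subset=None):
    """Return indices of cells that match the optional filters."""
    # Cast the time point only when one was given, so None means "no filter".
    tp = int(tp_subset) if tp_subset is not None else None
    keep = []
    for i, (t, c) in enumerate(zip(tp_labels, celltype_labels)):
        if tp is not None and int(t) != tp:
            continue
        # Compare cell types as strings so names and integer codes both work.
        if celltype_subset is not None and str(c) != str(celltype_subset):
            continue
        keep.append(i)
    return keep


# Made-up metadata for four cells.
tps = [0, 0, 1, 1]
celltypes = ["HSC", "Monocyte", "HSC", "Neutrophil"]

both = select_cells(tps, celltypes, tp_subset="0", celltype_subset="HSC")
only_tp = select_cells(tps, celltypes, tp_subset=1)
everything = select_cells(tps, celltypes)

print(both, only_tp, everything)
```

Printing the returned indices for a known subset is a quick way to check which cells end up as the initializing population.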

I was curious if you are able to comment on the number of steps that's recommended for the perturbation analysis. I don't have an intuitive sense if I should keep it at 10 or increase things. I am still working on figuring out how to plot the cell proportions from looking at your example code - if there's any guidance on that, would be much appreciated =)

Thanks for your time!

Dan