jon-xu / scSplit

Genotype-free demultiplexing of pooled single-cell RNA-Seq, using a hidden state model for identifying genetically distinct samples within a mixed population.
MIT License

main.py ValueError: setting an array element with a sequence #3

Closed drneavin closed 5 years ago

drneavin commented 5 years ago

Hello!

I am trying to run scSplit to demultiplex samples on which I do not have SNP genotype data. I have successfully run matrices.py and received the alt and ref csvs. However, when I try to run main.py, I receive an error. Here is the complete output of the error:

Traceback (most recent call last):
  File "/home/drenea/venv/3.6/lib/python3.6/site-packages/scSplit/main.py", line 356, in <module>
    main()
  File "/home/drenea/venv/3.6/lib/python3.6/site-packages/scSplit/main.py", line 334, in main
    model.distinguishing_alleles(pos)
  File "/home/drenea/venv/3.6/lib/python3.6/site-packages/scSplit/main.py", line 249, in distinguishing_alleles
    proj[i, j] = np.matmul(U[:,i], subt.iloc[j])
ValueError: setting an array element with a sequence.
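For readers who hit the same message: NumPy generally raises this ValueError when a single array element is assigned something that is not a scalar. The thread does not say what the actual cause was inside scSplit, so the following is only a minimal sketch of the error class with toy arrays; `assignment_fails` and all the values are hypothetical and are not taken from main.py.

```python
import numpy as np


def assignment_fails(value):
    """Return True if storing `value` in one element of a float array raises ValueError."""
    proj = np.zeros((2, 2))
    try:
        proj[0, 0] = value  # same pattern as `proj[i, j] = ...` in the traceback
    except ValueError:
        return True
    return False


u = np.array([1.0, 2.0, 3.0])

# 1-D times 1-D: matmul returns a scalar, so the assignment is fine.
row = np.array([4.0, 5.0, 6.0])
scalar_result = np.matmul(u, row)

# 1-D times 2-D: matmul returns a length-2 vector, which cannot fit in one element.
block = np.array([[4.0, 5.0], [5.0, 6.0], [6.0, 7.0]])
vector_result = np.matmul(u, block)

print(assignment_fails(scalar_result))
print(assignment_fails(vector_result))
```

Printing the shapes of both operands just before the failing line is a quick way to see whether the product is a scalar or an array.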

I have checked my csv files generated from matrices.py and they appear to be what I would expect. The command that I'm running is:

python $scSplitDIR/main.py -r ref_filtered.csv -a alt_filtered.csv -n 5
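One way to make the "files look as expected" check concrete is to confirm that the ref and alt matrices carry identical row and column labels. This is only a sketch: it assumes each CSV has a header row of cell barcodes and a first column of SNV identifiers, which the thread does not confirm, and `matrix_labels` and `same_layout` are hypothetical helpers, not part of scSplit.

```python
import csv
import io


def matrix_labels(handle):
    """Return (column_labels, row_labels) for a labelled CSV matrix."""
    reader = csv.reader(handle)
    header = next(reader)
    columns = header[1:]  # the first header cell names the index column
    rows = [record[0] for record in reader if record]
    return columns, rows


def same_layout(ref_handle, alt_handle):
    """True when both matrices have the same labels in the same order."""
    return matrix_labels(ref_handle) == matrix_labels(alt_handle)


# Tiny in-memory stand-ins for ref_filtered.csv and alt_filtered.csv (made-up values).
ref_demo = io.StringIO("SNV,cellA,cellB\nchr1:100,3,0\nchr1:200,1,2\n")
alt_demo = io.StringIO("SNV,cellA,cellB\nchr1:100,0,1\nchr1:200,4,0\n")
layout_ok = same_layout(ref_demo, alt_demo)
print(layout_ok)
```

With real files, the two `io.StringIO` objects would be replaced by `open("ref_filtered.csv", newline="")` and `open("alt_filtered.csv", newline="")`.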

I do receive an scSplit.log file which starts with these lines:

Starting data collection: 2019-06-24 17:03:32.422419
Allele counts matrices uploaded: 2019-06-24 17:03:34.924011

And then it records ~10,000 iterations, but no other files from the run are generated. I am running the pip-installed version of scSplit but also tried the main.py from GitHub. I am running Python 3.6.5.

Any help would be much appreciated!

-Drew

AmandaKedaigle commented 5 years ago

I'm getting the same error and would love some help as well! I tried the pip version and GitHub version, and also tried with and without providing it a vcf file, to no avail.

jon-xu commented 5 years ago

There was a bug, which has been fixed in the newest GitHub release. I will update the pip version later.

drneavin commented 5 years ago

Thanks Jon, the new script for main.py that you updated works well.

In addition, I would like to match the samples across the batches using SNP genotype data. Would there be a way for me to have the code report the SNP genotypes for each cluster across more than just the distinguishing variants? In other words, could I have the script output a matrix of all the SNPs with the cluster SNP genotypes?

jon-xu commented 5 years ago

Good to hear @drneavin ! For the genotype question, please refer to new thread #4

jon-xu commented 5 years ago

@AmandaKedaigle sorry, did it solve your issue as well, please?

AmandaKedaigle commented 5 years ago

Yes it did, it's working for me now as well. Thanks!

jon-xu commented 5 years ago

Great to know! @AmandaKedaigle