Closed Kai6662 closed 4 years ago
Hi,
could you please provide the exact error message that you get?
And, if possible, the head of your celltype labels file and your count table - so I can check whether they fulfill the requirements.
Kevin
Hi,
Please see the attached files. Thank you.
Best regards,
KaiK
new.single.cell.expressioncelltypes.txt <https://drive.google.com/file/d/1Knsicn2FtCSyAR-MyJCW7d33KLYNNvH/view?usp=drive_web>
new.single.cell.expression_norm_counts_all.txt <https://drive.google.com/file/d/1_zbC8ETHoptvEtjT8U4obGwQ4gO_q2EM/view?usp=drive_web>
Hi Kai,
thanks for sending those through. You just have to transpose your expression matrix into the form rows = samples, columns = genes, and then it should work.
However, while testing your data I found a small bug in the bulk_simulation.py script, which I thought I had already fixed. I have fixed it now, so if you clone the repository and use the corrected version of the 'bulk_simulation.py' script, it should work fine!
And for your convenience, here's a link to the transposed expression matrix, which works together with your cell type labels (you'll have to change the name again, of course). https://drive.google.com/file/d/1qDitSaHb2nAkLHmDe6Ad5TGQkLYgjzq1/view?usp=sharing
Hope that solves things!
Kevin
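For anyone who needs to do the transpose themselves, here is a minimal sketch using only the Python standard library. It assumes a tab-separated text file in which every row, including the header, has the same number of fields; the function name transpose_table and the file names in the usage note are made up for illustration and are not part of Scaden.

```python
import csv


def transpose_table(in_path, out_path, delimiter="\t"):
    """Write the transpose of a delimited text table (rows become columns).

    Assumption: every row, including the header row, has the same number
    of fields. A header that lacks the corner cell would need fixing first.
    """
    with open(in_path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter=delimiter))

    # Refuse ragged (or empty) input instead of silently dropping values,
    # because zip() would truncate every row to the shortest one.
    widths = {len(row) for row in rows}
    if len(widths) != 1:
        raise ValueError("rows have differing numbers of fields: %s" % sorted(widths))

    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
        # zip(*rows) turns column i of the input into row i of the output.
        writer.writerows(zip(*rows))
```

Usage would look like transpose_table("genes_by_cells.txt", "cells_by_genes.txt"). The whole table is held in memory, so for very large count tables a different approach may be needed.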
Hi Kevin,
I cloned the repository and tried again, but it still has a problem:
python /hpc/dhl_ec/kcui/scaden/scaden/preprocessing/bulk_simulation.py --cells 100 --samples 50 --data /hpc/dhl_ec/kcui/deconvolution/2.Scaden/plaque/only_plaque
Datasets: []
Traceback (most recent call last):
File "/hpc/dhl_ec/kcui/scaden/scaden/preprocessing/bulk_simulation.py", line 281, in
Best regards, Kai
On Thu, Jan 9, 2020 at 11:42 AM Kevin Menden notifications@github.com wrote:
Closed #22 https://github.com/KevinMenden/scaden/issues/22.
The problem is that the code can't find the datasets you're using.
You see this here:
Datasets: []
The empty brackets mean it could not find a usable dataset.
The program looks for files in the directory you give it that match the pattern you specify (by default it's '*_norm_counts_all.txt'). Maybe try adding a '/' to the end of your directory:
--data /hpc/dhl_ec/kcui/deconvolution/2.Scaden/plaque/only_plaque/
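One plausible reason the trailing '/' matters is that the directory and the pattern are glued together by plain string concatenation before the file search. This is an assumption, since the thread does not show the script's lookup code. The sketch below is not the actual Scaden code, and both function names are hypothetical. It contrasts concatenation with os.path.join, which inserts the separator itself.

```python
import glob
import os


def find_datasets_concat(data_dir, pattern="*_norm_counts_all.txt"):
    # Plain concatenation: "/some/dir" + "*_x.txt" becomes "/some/dir*_x.txt",
    # which searches the parent directory for names starting with "dir"
    # rather than searching inside the directory itself.
    return sorted(glob.glob(data_dir + pattern))


def find_datasets_joined(data_dir, pattern="*_norm_counts_all.txt"):
    # os.path.join adds the separator when it is missing, so the result is
    # the same with or without a trailing slash on data_dir.
    return sorted(glob.glob(os.path.join(data_dir, pattern)))
```

With the concatenating version, a directory given without the trailing separator yields an empty list, which would match the "Datasets: []" output above. Indexing into that empty list would then raise IndexError: list index out of range.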
Hi Kevin,
It is working now! Thanks for your help.
Best regards, Kai
Great :-)
No problem!
Hi,
I am using your software, but the results from Scaden and MuSiC differ a lot, and I am looking for a way to evaluate them. Why are they so different? Do you have any idea?
Best regards, Kai
[image: image.png]
Hi Kai,
it's of course not very reassuring to get different results from two algorithms. With MuSiC, we observed that it sometimes gives substantially wrong predictions, and we didn't know why. However, that's not to say that it doesn't work, as it achieved quite good performance on other datasets, similar to Scaden.
Of course I am biased and am inclined to say you can trust Scaden :) but in this case, I would actually run CIBERSORTx as well and see what kind of results it gives. If they are similar to those of one of the other two algorithms, then that's probably the best prediction.
Hope that helps!
Best, Kevin
Hi Kevin,
Thank you so much!
Best, Kai
Hi,
I processed the scRNA-seq dataset(s) that I want to use for training. I used Seurat for this and obtained cell type labels. Then I created the two input files (_norm_counts_all.txt for the count data, _celltypes.txt for the cell type labels). But when I run bulk_simulation.py to do the bulk simulation, I get an error: IndexError: list index out of range. I don't think my files have problems. What is the problem?