hylasD / tSpace

Running tSPACE on large datasets. Issues with igraph: Weight vector must be non-negative #1

Open ccp77 opened 5 years ago

ccp77 commented 5 years ago

Hi,

Thanks a lot for your great work. I'm trying to run tSpace on a dataset of 200'000 cells x 15 PCs (on macOS). However, I get the following error: [Error in { : task 1 failed - "At structural_properties.c:4295 : Weight vector must be non-negative, Invalid value"], which I believe is a problem with the igraph dependency. When I run tSpace on a downsampled version of the dataset (2'000 cells x 15 PCs), I'm able to obtain my ts_file. I tried installing a previous version of igraph (1.1.2) as suggested, but I still get the same error. Thank you very much!

hylasD commented 5 years ago

Hi ccp77

Thanks for using the algorithm, and I’m sorry that you are facing issues. I’ll do my best to help you proceed with the analysis.

I would recommend running the 200k-cell dataset on a server with more CPUs so that the calculations are faster and more efficient.

From the error you report, it seems that some of the weights in the graph are negative, so the distance calculation using Dijkstra's algorithm in igraph fails: the weights must be non-negative. When you downsampled your data, you probably dropped the problematic connections with negative weights. I don't think the issue is the igraph version but rather the presence of negative values. Could you please send me your code and at least a summary of the data that you feed into the tSpace function?
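A quick way to test this diagnosis is to scan the graph's edge weights for values below zero before running any shortest-path step. The snippet below is only a minimal sketch in plain Python, not tSpace code: the `(source, target, weight)` edge format, the function name `find_negative_weights`, and the toy values are all assumptions made for illustration.

```python
def find_negative_weights(edges):
    """Return every (source, target, weight) edge whose weight is below zero."""
    return [(u, v, w) for u, v, w in edges if w < 0]


# Hypothetical toy graph: one weight is a tiny negative number.
edges = [(0, 1, 0.42), (1, 2, -3.1e-15), (0, 2, 1.7)]

bad = find_negative_weights(edges)
print(f"{len(bad)} of {len(edges)} edges have negative weights")
for u, v, w in bad:
    print(f"  {u} -> {v}: {w:.3e}")
```

If the list comes back empty on the downsampled data but not on the full data, that would be consistent with the problematic connections having been dropped by the downsampling.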

Alternatively, I can take a look at the data and identify the problematic cells (events) if you want. It would be great to see the performance on such a large dataset. I have personally run up to 500k cells.

Please let me know if that works for you. Cheers Denis

ccp77 commented 5 years ago

Hi Denis,

Thank you very much for your reply. Yes, I would be happy if you could have a look at my data. Please find attached my matrix with the 15 PCs (flow data, auto-logicle transformation) and my script. Thank you very much! Best,

Cecile

hylasD commented 5 years ago

Hi Cecile,

I have had a busy week, but I will hopefully take a look at your data over the weekend. Cheers Denis

ccp77 commented 5 years ago

Hi Denis,

Yes, no problem, thank you very much. I have been using tSpace to explore some scRNAseq data as well, and I am very excited about the results. It would be great to do the same with the corresponding flow data! Best, Cécile

hylasD commented 5 years ago

Hi ccp77,

I examined your data and indeed, as I suspected, a few cell-cell pairs get negative distances during the kNN graph calculation, which I have not seen before in biological data. The shortest-path algorithm does not accept negative distances, and that is why tSpace fails. Looking at the values in detail, they are all very close to zero (all have an exponent of -15), so I would suggest proceeding with the data as it is and using the updated tSpace version.

I modified the core tSpace algorithm to report a message if negative distances are detected. They are automatically set to zero, and the trajectory inference analysis runs to completion. Additionally, all cell-cell pairs with negative distances are reported so that the user can examine them.
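For readers who want to see the idea in isolation, here is a rough sketch of that behaviour in plain Python. It is not the actual tSpace implementation (which is in R): the function names, the edge format, and the `tol` guard that refuses to clamp large negative values are assumptions added for illustration, and the Dijkstra routine is a generic textbook version rather than igraph's.

```python
import heapq


def clamp_negative_weights(edges, tol=1e-12):
    """Set slightly negative weights to zero and return (cleaned, offenders).

    `tol` is a hypothetical safety guard: weights below -tol are treated as
    real errors rather than rounding noise.
    """
    cleaned, offenders = [], []
    for u, v, w in edges:
        if w < 0:
            if w < -tol:
                raise ValueError(f"weight {w} on edge {u}-{v} is too negative to clamp")
            offenders.append((u, v, w))
            w = 0.0
        cleaned.append((u, v, w))
    if offenders:
        print(f"{len(offenders)} negative distance(s) set to zero: {offenders}")
    return cleaned, offenders


def dijkstra(n_nodes, edges, source):
    """Shortest-path distances from `source` in an undirected graph with weights >= 0."""
    adjacency = [[] for _ in range(n_nodes)]
    for u, v, w in edges:
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))
    dist = [float("inf")] * n_nodes
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, w in adjacency[u]:
            candidate = d + w
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# Hypothetical toy graph with one rounding-level negative weight.
edges = [(0, 1, 0.42), (1, 2, -3.1e-15), (0, 2, 1.7)]
cleaned, offenders = clamp_negative_weights(edges)
dist = dijkstra(3, cleaned, source=0)
print(dist)
```

Clamping is reasonable here only because the offending values are at the level of floating-point noise; larger negative distances would point to a genuine problem in the distance calculation and should be examined rather than silently zeroed.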

Please update tSpace and let me know if you run into any other issues. Please also let me know if I can close this issue as solved.

Cheers