Closed guillaumecharbonnier closed 5 years ago
This is an interesting point and workaround. It would also mean adding something to the plot title to tell the user that only a selection of annotations is shown. We could also switch to a radar plot, which might be better suited as the number of features increases (?).
On Wed, 3 Apr 2019 at 22:42, guillaumecharbonnier notifications@github.com wrote:
Currently, we get this error:
|-- 22:11-ERROR-ologram : The selected key in --more-keys should be associated with less than 50 different values.
Obviously the current plot layout cannot be drawn in this situation, but at least we could produce the table. Then we could think of a strategy to still display something. Maybe display the best 20 values according to their adjusted p-value?
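The "best 20 values" idea could look roughly like the sketch below. It is only an illustration with pandas: the column names `feature_type` and `adj_pval` and the helper `top_n_by_adjusted_pvalue` are hypothetical and are not taken from the OLOGRAM code. The full table is left untouched and only the plotted subset shrinks; the title string tells the user that a selection is shown.

```python
import pandas as pd


def top_n_by_adjusted_pvalue(table: pd.DataFrame, n: int = 20,
                             pval_col: str = "adj_pval") -> pd.DataFrame:
    """Return the n rows with the smallest adjusted p-value.

    `pval_col` is a hypothetical column name used for illustration.
    """
    # nsmallest returns a new frame; the input table is not modified.
    return table.nsmallest(n, pval_col)


# Toy results table: one key with 60 distinct values (above the 50 limit).
results = pd.DataFrame({
    "feature_type": [f"motif_{i}" for i in range(60)],
    "adj_pval": [(i + 1) / 100 for i in range(60)],
})

selected = top_n_by_adjusted_pvalue(results)

# Title that makes the selection explicit to the user.
plot_title = f"Top {len(selected)} of {len(results)} values by adjusted p-value"
```

The complete table would still be written to disk; only the plot would use `selected`.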
There is no radar plot in plotnine at the moment. I will open an issue to find out whether one is in progress...
Question: is the threshold of 50 completely arbitrary? We could always produce a very wide barplot and let the user trim the resulting image.
This is arbitrary. I think there is a limitation on PDF width.
On Thu, 4 Apr 2019 at 13:16, Quentin Ferré notifications@github.com wrote:
Question: is the threshold of 50 completely arbitrary? We could always produce a very wide barplot and let the user trim the resulting image.
The PDF width limit is arbitrary as well, if I recall correctly?
Unless I misunderstand how you plan to use the radar plot, I think a volcano plot of p-value against fold change (FC), with text_repel labels for the interesting outliers, should be suitable when testing more than ~50 motifs.
Yep.
On Thu, 4 Apr 2019 at 14:08, guillaumecharbonnier notifications@github.com wrote:
Unless I misunderstand how you plan to use the radar plot, I think a volcano plot of p-value against fold change (FC), with text_repel labels for the interesting outliers, should be suitable when testing more than ~50 motifs.
Denis Puthier, TAGC/INSERM U 1090, Marseille, France
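For reference, the volcano plot discussed above could be prepared along these lines. This is a sketch under assumptions, not the OLOGRAM implementation: the column names (`feature_type`, `fold_change`, `adj_pval`), both helper functions and the cutoffs are hypothetical. It assumes strictly positive fold changes and non-zero p-values, since a zero would give an infinite coordinate. The plotting helper uses plain `geom_text`; label repulsion would need to be added separately.

```python
import numpy as np
import pandas as pd


def volcano_table(table, fc_col="fold_change", pval_col="adj_pval",
                  fc_cutoff=1.0, pval_cutoff=0.05):
    """Add volcano coordinates and a flag for features worth labelling."""
    out = table.copy()
    out["log2_fc"] = np.log2(out[fc_col])          # x axis
    out["neg_log10_p"] = -np.log10(out[pval_col])  # y axis
    # Label only features that are both strongly biased and significant.
    out["label_me"] = (out["log2_fc"].abs() >= fc_cutoff) & (out[pval_col] <= pval_cutoff)
    return out


def volcano_plot(table):
    """Build the plot; plotnine is imported here so the helper above works without it."""
    from plotnine import ggplot, aes, geom_point, geom_text, labs

    flagged = table[table["label_me"]]
    return (
        ggplot(table, aes(x="log2_fc", y="neg_log10_p"))
        + geom_point(aes(color="label_me"))
        + geom_text(aes(label="feature_type"), data=flagged, va="bottom", size=8)
        + labs(x="log2(fold change)", y="-log10(adjusted p-value)")
    )


# Toy data, invented for illustration only.
results = pd.DataFrame({
    "feature_type": ["motif_a", "motif_b", "motif_c", "motif_d"],
    "fold_change": [4.0, 1.1, 0.25, 8.0],
    "adj_pval": [0.001, 0.5, 0.01, 0.2],
})
vt = volcano_table(results)
```

Because only flagged points get a label, the plot stays readable however many values the key has.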
Just reporting that the current plot code can display far more than 50 keys before hitting the PDF width limit.
Actually, the only reason the plot is messed up is that feature_type for "--more-keys" is currently the combination of the key and the value separated by a newline. @dputhier @qferre Is there a reason for that, or can we switch to another separator, e.g. ": "?
Also, can I add a third plot to the current PDF output with the FC metric? The current metrics put the visual emphasis on big features, and users may be more interested in comparing which features have the highest enrichment bias.
Yes. For sure you can implement additional plots. The volcano plot, for instance, may be a good choice. If you look at the code you will see that the plotting part needs some refactoring. In fact, it would require melting the dataframe properly once so that all plots can be drawn from the same dataframe...
Yes, the feature_type is currently the combination of the key and the value separated by a newline. We chose this solution to avoid very long names in the plot, which were also messing up the diagram...
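As an illustration of the "melt once" refactoring, a wide results table can be reshaped into a single long dataframe that every plot then filters. The sketch below uses `pandas.DataFrame.melt`; the column names and values are hypothetical and chosen only so that each name ends with a `_true` or `_expected` suffix.

```python
import pandas as pd

# Hypothetical wide table: one row per feature, one column per metric.
wide = pd.DataFrame({
    "feature_type": ["exon", "intron"],
    "nb_intersections_true": [120, 80],
    "nb_intersections_expected": [60.0, 90.0],
    "summed_bp_overlaps_true": [15000, 9000],
    "summed_bp_overlaps_expected": [7000.0, 9500.0],
})

# Melt once: one row per (feature, metric) pair.
tidy = wide.melt(id_vars="feature_type", var_name="variable", value_name="value")

# Split e.g. "nb_intersections_true" into statistic="nb_intersections", kind="true".
tidy[["statistic", "kind"]] = tidy["variable"].str.rsplit("_", n=1, expand=True)
tidy = tidy.drop(columns="variable")

# Each plot can now select its rows from the same dataframe.
nb_intersections = tidy[tidy["statistic"] == "nb_intersections"]
```

With this layout, adding a new plot means adding a filter, not another reshaping step.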
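One way to reconcile the two concerns (a single-line separator, but no very long names) is sketched below. The helper `make_feature_label` and its `max_len` default are hypothetical, not part of pygtftk. Truncation can make two long labels collide, so a real implementation would need to check for duplicates.

```python
def make_feature_label(key, value, sep=": ", max_len=30):
    """Join key and value with `sep`, truncating labels longer than `max_len`."""
    label = f"{key}{sep}{value}"
    if len(label) > max_len:
        # Keep the start of the label and mark the truncation.
        label = label[: max_len - 3] + "..."
    return label


short_label = make_feature_label("gene_biotype", "protein_coding")
# Passing sep="\n" reproduces the current key/newline/value layout.
legacy_label = make_feature_label("gene_biotype", "protein_coding", sep="\n")
long_label = make_feature_label("gene_biotype", "a_value_with_a_really_long_name_indeed")
```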
The error was removed in dad7be338cac2f156cec15494bae1b44550f73d2. Fixed.