A computational library for learning and evaluating biological knowledge graph embeddings. Please see the main PyKEEN repo at https://github.com/pykeen/pykeen/
The EXCAPE-DB (manuscript, data download) is the easiest database to use for chemogenomic data; it's really the pinnacle of curation and preprocessing.
Until now, I've asked students to work on this, but they never realized how important it was. I will finish the corresponding bio2bel repository myself, and then we will have the best data set that exists for this purpose.
The thing is, it's very important to consider the IC50 values associated with each edge. How would that fit into the available models, if at all? Assigning a hard cutoff is not a good idea, since it would throw away a large amount of information. Maybe we could bin the values, but then we would also have to introduce some notion of ordering among the edges into the model.
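To make the binning idea concrete, here is a minimal sketch in plain Python. It is not part of PyKEEN or bio2bel. It assumes IC50 values are given in nanomolar, converts them to the pIC50 scale (the negative base-10 logarithm of the molar IC50), and assigns each edge to one of several ordered bins. The bin edges, the bin labels, and the function names (`ic50_nm_to_pic50`, `bin_index`, `to_triples`) are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical bin edges on the pIC50 scale and matching labels.
# These thresholds are illustrative assumptions, not values from EXCAPE-DB.
PIC50_BIN_EDGES = (5.0, 6.0, 7.0)
BIN_LABELS = ("inactive", "weak", "moderate", "potent")


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nm)


def bin_index(pic50: float, edges=PIC50_BIN_EDGES) -> int:
    """Return the ordinal bin: the number of edges the value reaches or exceeds."""
    return sum(pic50 >= edge for edge in edges)


def to_triples(rows):
    """Turn (compound, target, ic50_nm) rows into binned edges.

    Each result is (compound, bin_label, target, bin_ordinal). The ordinal is
    kept alongside the label so that the ordering of the bins is not lost.
    """
    triples = []
    for compound, target, ic50_nm in rows:
        idx = bin_index(ic50_nm_to_pic50(ic50_nm))
        triples.append((compound, BIN_LABELS[idx], target, idx))
    return triples


if __name__ == "__main__":
    # Made-up identifiers and values, for demonstration only.
    example = [("compound_a", "target_x", 10.0), ("compound_b", "target_x", 100000.0)]
    for triple in to_triples(example):
        print(triple)
```

Treating each bin label as its own relation type would let the existing models consume the data unchanged, but they would see the bins as unrelated relations. Keeping the ordinal index alongside the label is one way to preserve the ordering, so that it could later be passed to a model or loss that is aware of it.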