Closed PartheshSoni closed 3 years ago
Hi,
There isn't a way to do this directly in logomaker, but here is what I can suggest: modify the function alignment_to_matrix in logomaker so that it counts multi-character strings at each position, as your task requires. I've pasted a link to the method below. If you do it this way, I think you'll have to update the loop starting on line 582. Once you have an updated dataframe, you can draw it with logomaker.
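For illustration, here is a rough standalone sketch of the counting step. It is not logomaker's code and it does not modify alignment_to_matrix; the function name kmer_position_counts and its return format are made up for this example, and it assumes all sequences are aligned and of equal length.

```python
from collections import Counter


def kmer_position_counts(sequences, k):
    """Count every k-mer at every start position of equal-length sequences.

    Hypothetical helper (not part of logomaker). Returns (kmers, rows):
    the sorted k-mer vocabulary, and one list of counts per start position,
    ordered like `kmers`.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    length = lengths.pop()
    if k > length:
        raise ValueError("k is longer than the sequences")

    # One Counter per start position i, over the k-mers seq[i:i+k].
    per_position = [
        Counter(seq[i:i + k] for seq in sequences)
        for i in range(length - k + 1)
    ]
    # Column order: every k-mer seen anywhere, sorted alphabetically.
    kmers = sorted(set().union(*per_position))
    # Missing k-mers count as 0 because Counter returns 0 for unknown keys.
    rows = [[counts[kmer] for kmer in kmers] for counts in per_position]
    return kmers, rows


kmers, rows = kmer_position_counts(["ACGT", "ACGA", "TCGT"], k=2)
```

If pandas is available, pandas.DataFrame(rows, columns=kmers) should give a positions-by-k-mers count table. Dividing each row by its sum would turn the counts into per-position frequencies.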
During development we wrote methods for drawing dinucleotide logos (e.g., see below) but these currently aren't implemented. I hope this helps. I am not sure about other libraries.
Thanks, that makes sense. However, I am passing dataframes to the Logo class whose columns are multi-letter strings, and I get an exception saying that multi-letter characters are not supported. After removing that exception clause and some other related clauses, I can plot as required, but the letters are then drawn in grayscale. To resolve this, I pass a dict containing the color scheme (a letter->color mapping) and remove _expand_color_dict(), which solves the issue.
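As a small sketch of what such a mapping could look like: the palette and the helper name kmer_color_dict below are made up for illustration, and coloring each k-mer by its first letter is just one possible choice. An unmodified Logo class would presumably still reject multi-letter columns, so this only helps together with the changes described above.

```python
# Hypothetical base palette; any single-letter color choices would do.
BASE_COLORS = {"A": "green", "C": "blue", "G": "orange", "T": "red"}


def kmer_color_dict(kmers, base_colors=BASE_COLORS, default="gray"):
    """Map each multi-letter column name to a color, keyed on its first letter.

    Hypothetical helper (not part of logomaker). K-mers whose first letter
    is not in `base_colors` fall back to `default`.
    """
    return {kmer: base_colors.get(kmer[:1], default) for kmer in kmers}


colors = kmer_color_dict(["AC", "CG", "GA", "GT", "TC", "NN"])
```

The keys are the dataframe's column names, so every k-mer column gets an explicit color instead of relying on any per-character expansion.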
Great, I'll close the issue.
I basically want to find a sequence motif that takes k-mers into account (of any length between 1 and maybe 5 or 6) instead of single characters. So, I want to find the frequency of each k-mer at each position. Any idea how I can do that using logomaker or any other library?