mattflor / chorddiag

R interface to D3 chord diagrams

creating a matrix #15

Open mictadlo opened 6 years ago

mictadlo commented 6 years ago

Unfortunately, I do not understand how I could create a matrix for my data. It is split across 2 JSON files. The first one describes the circle labeling, color and size:

[
  {
    "color": "#996600", 
    "id": "chr03", 
    "len": 35020413, 
    "label": "chr03"
  }, 
  {
    "color": "#666600", 
    "id": "tig00007144", 
    "len": 40868, 
    "label": "tig00007144"
  }, 
  {
    "color": "#666600", 
    "id": "tig00026480", 
    "len": 95961, 
    "label": "tig00026480"
  },
...
]

On the other hand, the second file describes the relationship between each label in the chord chart.

[
  {
    "source": {
      "start": 30824, 
      "end": 23113, 
      "id": "tig00007144"
    }, 
    "target": {
      "start": 33203431, 
      "end": 33211142, 
      "id": "chr03"
    }
  }, 
  {
    "source": {
      "start": 48387, 
      "end": 1, 
      "id": "tig00026480"
    }, 
    "target": {
      "start": 35010628, 
      "end": 34962190, 
      "id": "chr03"
    }
  }, 
...
]

How do I convert the above relationship file to a proper matrix?

Thank you in advance.

mattflor commented 6 years ago

Well, as this is an R wrapper around some JavaScript code, you need to provide your data as an R matrix. Here's some code from the README that should give you an idea (think of 'have' as the source and of 'prefer' as the target):

m <- matrix(c(11975,  5871, 8916, 2868,
              1951, 10048, 2060, 6171,
              8010, 16145, 8090, 8045,
              1013,   990,  940, 6907),
            byrow = TRUE,
            nrow = 4, ncol = 4)
haircolors <- c("black", "blonde", "brown", "red")
dimnames(m) <- list(have = haircolors,
                    prefer = haircolors)
m
#>         prefer
#> have     black blonde brown  red
#>   black  11975   5871  8916 2868
#>   blonde  1951  10048  2060 6171
#>   brown   8010  16145  8090 8045
#>   red     1013    990   940 6907

mictadlo commented 6 years ago

Hi, thank you for your example, but I still do not know how to convert my data to the required matrix.

mattflor commented 6 years ago

Start by converting your JSON files to data frames, e.g. using the jsonlite R package.
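As a rough sketch of that first step, the snippet below parses the entries quoted earlier in this thread with `jsonlite::fromJSON`. The JSON is inlined as strings only to keep the example self-contained; with real data you would pass a file path instead (any file name you choose is your own, none is given in this thread). The variable names `nodes` and `links` are illustrative assumptions.

```r
library(jsonlite)

# Entries copied from the two files quoted in this thread
# (the parts elided with "..." are simply left out).
nodes_json <- '[
  {"color": "#996600", "id": "chr03", "len": 35020413, "label": "chr03"},
  {"color": "#666600", "id": "tig00007144", "len": 40868, "label": "tig00007144"},
  {"color": "#666600", "id": "tig00026480", "len": 95961, "label": "tig00026480"}
]'
links_json <- '[
  {"source": {"start": 30824, "end": 23113, "id": "tig00007144"},
   "target": {"start": 33203431, "end": 33211142, "id": "chr03"}},
  {"source": {"start": 48387, "end": 1, "id": "tig00026480"},
   "target": {"start": 35010628, "end": 34962190, "id": "chr03"}}
]'

# An array of flat objects becomes a data frame with one row per object.
nodes <- fromJSON(nodes_json)

# flatten = TRUE turns the nested source/target objects into plain
# columns such as source.id and target.start.
links <- fromJSON(links_json, flatten = TRUE)

str(nodes)
names(links)
```

With both files in data frame form, the link table has one row per alignment, which is the shape needed for any later aggregation.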

mictadlo commented 6 years ago

Without any coding, what would the matrix look like for the above 2 JSON files?

mattflor commented 6 years ago

You may want to look at the d3 chord layout: https://github.com/d3/d3-chord

[...] each matrix[i][j] represents the flow from the ith node in the network to the jth node. Each number matrix[i][j] must be nonnegative, though it can be zero if there is no flow from node i to node j.

I.e. the chord layout handles everything automatically: it calculates total node/group sizes and chord start and end positions, so you don't specify those explicitly...
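To make the matrix[i][j] description concrete, here is a minimal base R sketch that fills such a matrix from the two links quoted earlier in this thread. It rests on one loud assumption: the "flow" of a link is taken to be the aligned length on the source side, which is only one possible choice. The column names and the `flow` column are illustrative, not part of chorddiag.

```r
# The two links quoted in this thread, written as a flat data frame.
links <- data.frame(
  source.id    = c("tig00007144", "tig00026480"),
  source.start = c(30824, 48387),
  source.end   = c(23113, 1),
  target.id    = c("chr03", "chr03"),
  stringsAsFactors = FALSE
)

# Assumption: flow = aligned length on the source (inclusive coordinates).
links$flow <- abs(links$source.end - links$source.start) + 1

# One row and one column per node; zero means "no flow".
ids <- c("chr03", "tig00007144", "tig00026480")
m <- matrix(0, nrow = length(ids), ncol = length(ids),
            dimnames = list(source = ids, target = ids))

# Add each link's flow to the cell [source, target].
for (k in seq_len(nrow(links))) {
  i <- links$source.id[k]
  j <- links$target.id[k]
  m[i, j] <- m[i, j] + links$flow[k]
}

m
```

Under that assumption, only the two cells in the chr03 column are nonzero. The start and end coordinates no longer appear anywhere in the matrix, since the layout works from totals alone.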

mictadlo commented 6 years ago

Thank you for your link, but I still do not understand what the matrix should look like for my data.

mattflor commented 6 years ago

Sorry, show me some code of what you have tried so far, and some more explanation of what your data means. Your questions are too unspecific otherwise.

mictadlo commented 6 years ago

One of them is the reference file (chr03), which contains one long sequence (35020413 characters long), and the other file contains multiple sequences with different lengths and different ids (tig00007144, tig00026480, ...). I used alignment software called BLAST to align these two sets of sequences. The original alignment results are stored in a tab-separated file:

tig00007144 chr03   23113   30824   33203431    33211142
tig00026480 chr03   1   48387   35010628    34962190
tig00003221 chr03   16916   29961   2127862 2140878
tig00010111 chr03   218 6989    23106738    23113500
tig00000318 chr03   1   18244   28621116    28639312
tig00009327 chr03   32147   40878   34160279    34151526
tig00025208 chr03   65878   79311   17006900    17020370
tig00019172 chr03   43720   50583   23113500    23106638
tig00004923 chr03   44154   50849   21159875    21153164

This is the column explanation:

I would like to see where each part (tig00007144, tig00026480, ...) maps on chr03.

Thank you in advance.

mattflor commented 6 years ago

I'm afraid the chorddiag package is not suited for your purpose. You may be able to misuse it to some (but certainly not satisfactory) degree. I would suggest you look for another tool.
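One way to see what a plain flow matrix would lose is to look at the alignment rows as a table. The base R sketch below reads three of the rows posted above and derives the aligned length and orientation per hit. The column names (`query`, `subject`, `q_start`, ...) are assumptions inferred from the JSON shown earlier, not names given in this thread.

```r
# Three alignment rows copied from the thread.
# Column names are assumed for illustration.
aln <- read.table(text = "
tig00007144 chr03 23113 30824 33203431 33211142
tig00026480 chr03 1 48387 35010628 34962190
tig00003221 chr03 16916 29961 2127862 2140878
", col.names = c("query", "subject", "q_start", "q_end", "s_start", "s_end"),
   stringsAsFactors = FALSE)

# Aligned length on the contig (inclusive coordinates).
aln$q_len <- abs(aln$q_end - aln$q_start) + 1

# TRUE when the hit runs in the reverse direction on chr03.
aln$reverse <- aln$s_start > aln$s_end

# Sort by leftmost position on the reference to see where each contig lands.
aln[order(pmin(aln$s_start, aln$s_end)), ]
```

Positions and orientation are exactly the information a chord matrix of totals cannot carry, which is consistent with looking at a genome-oriented plotting tool instead.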

jelias1 commented 6 years ago

Mictadlo, your data reminded me of an example in the RCircos package. Here is some documentation on that package.