graph-genome / component_segmentation

Read in ODGI Bin output and identify co-linear components
Apache License 2.0

Put coverage, inversion and position into JSON instead of Bool Vals for Occupation. #3

Closed: subwaystation closed 4 years ago

subwaystation commented 4 years ago

Quoting @josiahseaman: "I was thinking about next steps for development. We've updated the JSON format with occupants, and we could expand that; we essentially have a file format version. The next obvious step is to add in the matrix information, which we have in segmentation because components contain multiple bins. It seems like we may eventually want to contain everything from Erik's bins: coverage, inversion, position."

subwaystation commented 4 years ago

"We can leave position alone right now, to be handled (probably separately) by a coordinate conversion service. So that leaves coloring by inversion and copy number (coverage) instead of having any sort of hard cutoff (coverage > 0.1), which was just a temporary thing. Do we still need occupants for anything if we already have the more information-rich matrix? So I'm thinking:

    "first_bin": 43,
    "last_bin": 45,
    "matrix": [
        [(.7, 0.02), (.5, 0.01), (.8, 1.0)],  # individual with last bin inverted
        [(.8, 0.03), (.6, 0.01), (.7, 0.0)],  # individual with no inversion
    ]

    for individual in matrix:
        for bin in individual:
            coverage, inversion = bin

Then the ordering of individuals would match schematic.path_names, so we don't have to repeat them"
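To make the proposal concrete, here is a minimal Python sketch of how such a component record might be written and read back. It is not the project's actual code. The names `path_names`, `occupants`, and `inverted_bins` are hypothetical, the rule "a path occupies a component if any bin exceeds the cutoff" is an assumption about the old boolean behavior, and the 0.5 inversion threshold is purely illustrative. Note that JSON has no tuples or comments, so each `(coverage, inversion)` pair round-trips as a two-element list.

```python
import json

# Hypothetical stand-in for schematic.path_names; rows of "matrix" follow this order.
path_names = ["sample_a", "sample_b"]

# One component in the proposed layout: one row per path, one pair per bin.
component = {
    "first_bin": 43,
    "last_bin": 45,
    "matrix": [
        [(0.7, 0.02), (0.5, 0.01), (0.8, 1.0)],  # last bin inverted
        [(0.8, 0.03), (0.6, 0.01), (0.7, 0.0)],  # no inversion
    ],
}

# Round-trip through JSON: tuples are serialized as arrays and come back as lists.
decoded = json.loads(json.dumps(component))


def occupants(matrix, cutoff=0.1):
    """Recover boolean occupancy from coverage.

    Assumption: a path counts as present if any bin has coverage > cutoff.
    """
    return [any(coverage > cutoff for coverage, _inv in row) for row in matrix]


def inverted_bins(record, names, threshold=0.5):
    """Map each path name to the bin numbers whose inversion exceeds threshold.

    The threshold is illustrative only; the thread proposes coloring by the
    raw value rather than applying a cutoff.
    """
    result = {}
    for name, row in zip(names, record["matrix"]):
        result[name] = [
            record["first_bin"] + offset
            for offset, (_cov, inversion) in enumerate(row)
            if inversion > threshold
        ]
    return result


print(occupants(decoded["matrix"]))
print(inverted_bins(decoded, path_names))
```

Because occupancy can be recomputed from coverage this way, a separate occupants list would be redundant under this layout, which is the question raised above.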

josiahseaman commented 4 years ago