ssc-oscar / lookup

A mirror of bitbucket.org/swcs/lookup

Tracing how p2P mapping is created? #23

Open k----n opened 3 years ago

k----n commented 3 years ago

I tried to trace the following p2P mapping of rocker-org_rocker:

First I get the mapping:

> echo rocker-org_rocker | getValues p2P
rocker-org_rocker;04n0_docker

Next I diff the commit lists of the two projects:

Lines added (commits only in 04n0_docker):
> diff --changed-group-format='%>' --unchanged-group-format='' <(echo rocker-org_rocker | getValues -f p2c | cut -d\; -f2 | sort -u) <(echo 04n0_docker | getValues -f p2c | cut -d\; -f2 | sort -u)  | wc -l
779

Lines deleted (commits only in rocker-org_rocker):
> diff --changed-group-format='%<' --unchanged-group-format='' <(echo rocker-org_rocker | getValues -f p2c | cut -d\; -f2 | sort -u) <(echo 04n0_docker | getValues -f p2c | cut -d\; -f2 | sort -u)  | wc -l
521

Now I look at the commit counts of both projects:

> echo rocker-org_rocker | getValues -f p2c | wc -l
521

> echo 04n0_docker | getValues -f p2c | wc -l
779

Every commit of each project shows up in the diff, so it looks like P (04n0_docker) and p (rocker-org_rocker) do not share any commits at all...
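The same check can be restated with sets. This is only a minimal sketch on toy commit IDs (not real World of Code data), and `overlap_summary` is a hypothetical helper name: if the "only in one project" counts equal the full commit counts, the shared set is empty.

```python
def overlap_summary(commits_a, commits_b):
    """Return (only_in_a, only_in_b, shared) counts for two commit collections."""
    a, b = set(commits_a), set(commits_b)  # deduplicate, like `sort -u`
    return len(a - b), len(b - a), len(a & b)


# Toy commit IDs standing in for the output of `getValues -f p2c`.
p_commits = ["c1", "c2", "c3"]
P_commits = ["c4", "c5"]

only_p, only_P, shared = overlap_summary(p_commits, P_commits)
print(only_p, only_P, shared)
```

With the numbers above, the equivalent result would be 521 commits only in p, 779 only in P, and none shared.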

I assume that the P2p and p2P mappings are based on this paper, in which case the two projects should be sharing commits: https://arxiv.org/pdf/2002.02707.pdf

audrism commented 3 years ago

The community detection works on shared commits. To be in the same community, a and b do not need to share a commit with each other; they may each share a lot of commits with, for example, c. A community is defined over the whole graph, not as a pairwise operation.
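To illustrate how a and b can land together without sharing a commit, here is a minimal sketch on toy data. It uses plain connected components as a stand-in, which is an assumption made for brevity and not the community detection algorithm actually used to build p2P; the project names and commit IDs are made up.

```python
from collections import defaultdict


def components(project_commits):
    """Group projects linked, directly or transitively, by shared commits."""
    # Index: commit -> projects that contain it.
    by_commit = defaultdict(set)
    for project, commits in project_commits.items():
        for commit in commits:
            by_commit[commit].add(project)

    # Two projects are neighbors if they share at least one commit.
    neighbors = {p: set() for p in project_commits}
    for projects in by_commit.values():
        for p in projects:
            neighbors[p] |= projects - {p}

    # Depth-first traversal to collect each connected group.
    seen, groups = set(), []
    for start in project_commits:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            p = stack.pop()
            if p in group:
                continue
            group.add(p)
            stack.extend(neighbors[p] - group)
        seen |= group
        groups.append(group)
    return groups


# a and b share nothing with each other, but each shares a commit with c.
toy = {"a": {"1", "2"}, "b": {"3", "4"}, "c": {"2", "3"}, "d": {"9"}}
groups = components(toy)
print(groups)
```

Here a and b end up in the same group only because of c, while d stays on its own.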

k----n commented 3 years ago

What's the most efficient way to find c?

Is there some way to visualize these communities? If no, I think it's a good hackathon project.

audrism commented 3 years ago

To be precise, there may not be a single c; the connection may run through something further down the network. If you want to find a c, there are two options. getNeighbors (new API) gives the neighborhood up to a certain depth. But what you need may be more of a "getPath a b" type of query, which would give you the shortest path between a and b via entities of type p. That would be a great hackathon project indeed.
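As a starting point for such a project, both ideas can be sketched with breadth-first search over an in-memory adjacency map. This is a sketch under stated assumptions: `neighborhood` and `shortest_path` are hypothetical names, not the getNeighbors API or an existing getPath, and the toy graph stands in for project-to-project links derived from shared commits.

```python
from collections import deque


def neighborhood(graph, start, depth):
    """Return all nodes within `depth` hops of `start`, excluding `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == depth:
            continue  # do not expand beyond the requested depth
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return {n for n in dist if n != start}


def shortest_path(graph, a, b):
    """Return one shortest path from a to b as a list, or None if unreachable."""
    parent = {a: None}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            path = []
            while node is not None:  # walk back to the start via parents
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Toy undirected graph: a and b are connected only through c and e.
toy_graph = {"a": ["c"], "c": ["a", "e"], "e": ["c", "b"], "b": ["e"], "z": []}
print(shortest_path(toy_graph, "a", "b"))
print(neighborhood(toy_graph, "a", 2))
```

On the real data the adjacency map would be far too large to hold like this, so the lookups would need to go through the existing maps instead.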

Perhaps Mahmoud and Elena can extend their visualization for such cases.

Also, on worldofcode.org, the lookup search application lets you navigate among various types of entities and can produce rudimentary graphs.

k----n commented 3 years ago

Is there a visualization for the rudimentary graph on worldofcode.org? It doesn't really seem different from just using the CLI to me.

Yes it looks like https://github.com/woc-hack/collab-graph can be extended, but maps need to be generated.

I think I read somewhere that neo4j has been tried before but the data was too large. Still, I think having a graph-based query language to look up relationships and communities might be a good idea.

You might also be interested in this for graphing (it's a company, but they have good ideas about using graphs/making data accessible): https://linkurio.us/

P.S. http://worldofcode.org returns a 404 instead of redirecting to the HTTPS version, https://worldofcode.org

audrism commented 3 years ago

Rudimentary graphs: select some object, then you will see graph buttons that display a small neighbourhood: https://worldofcode.org/lookupresult?sha1=f883e20ee22c3a2847d24bca369adf63835987f3&type=commit

Currently, generating large graphs is a bit slow, but we have some experts to work on it in the future.

Thank you, I fixed the HTTP redirect.

k----n commented 3 years ago

Neat! Is there a legend for the graph?

I think the red is the commit, but what are the other colors?

audrism commented 3 years ago

Red is probably a project. I added questions about colors/labeling to the DRE repo.