vigna / webgraph-rs

A Rust port of the WebGraph framework
Apache License 2.0

Missing properties from `graph.stats` #109

Closed by progval 2 months ago

progval commented 2 months ago

The Rust rewrite of permute removed a bunch of properties from graph.stats: https://github.com/vigna/webgraph-big/blob/51f1ebec826397a79c500e8c09e7ae7192015285/src/it/unimi/dsi/big/webgraph/BVGraph.java#L2490-L2520 -> https://github.com/vigna/webgraph-rs/blob/af48d3b8f98b2378ebf50e2ea536947bd60c65d8/src/graphs/bvgraph/comp/flags.rs#L89-L170

Are you planning to re-add them?

vigna commented 2 months ago

Yes, but the idea was to put them somewhere other than the compression code.

vigna commented 2 months ago

BTW is there any specific stat you had in mind?

progval commented 2 months ago

compratio, bitspernode, and bitsperlink seem to be the only ones we used at any point at SWH

vigna commented 2 months ago

Oh wow. We don't even do those? That's bad. I thought you meant something like the distribution of gaps. That should be easy to fix.

zommiommy commented 2 months ago

I'll add them. What is the compression ratio computed with respect to?

vigna commented 2 months ago

It is computed with respect to the information-theoretical lower bound. You can have a look at the Java code.
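
For readers without the Java code at hand, here is a minimal sketch of how the three properties could be computed. It assumes the lower bound is log2 of the binomial coefficient C(n², m) for n nodes and m arcs, estimated with Stirling's approximation, and that all three values are derived from the total length in bits of the compressed graph. The names `stirling`, `lower_bound_bits`, `GraphStats`, and `graph_stats` are hypothetical and are not taken from webgraph-rs.

```rust
use std::f64::consts::{LN_2, PI};

/// Stirling's approximation of ln(n!); valid for n > 0.
fn stirling(n: f64) -> f64 {
    n * n.ln() - n + 0.5 * (2.0 * PI * n).ln()
}

/// Approximate lower bound, in bits, for a graph with `num_nodes` nodes and
/// `num_arcs` arcs: log2 of the binomial coefficient C(n^2, m).
/// Assumes 0 < num_arcs < num_nodes^2; outside that range the result is NaN.
fn lower_bound_bits(num_nodes: u64, num_arcs: u64) -> f64 {
    let n2 = (num_nodes as f64) * (num_nodes as f64);
    let m = num_arcs as f64;
    // Natural-log estimate of ln C(n^2, m), converted to bits.
    (stirling(n2) - stirling(m) - stirling(n2 - m)) / LN_2
}

/// Hypothetical container for the three properties discussed in this thread.
struct GraphStats {
    bits_per_node: f64,
    bits_per_link: f64,
    comp_ratio: f64,
}

/// Derives the statistics from the compressed length in bits (assumed input).
fn graph_stats(length_bits: u64, num_nodes: u64, num_arcs: u64) -> GraphStats {
    let bits = length_bits as f64;
    GraphStats {
        bits_per_node: bits / num_nodes as f64,
        bits_per_link: bits / num_arcs as f64,
        comp_ratio: bits / lower_bound_bits(num_nodes, num_arcs),
    }
}
```

Calling `graph_stats(length_bits, num_nodes, num_arcs)` returns all three values at once. A compression ratio below 1 would mean the graph takes fewer bits than the approximate bound for an arbitrary graph with the same number of nodes and arcs.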

zommiommy commented 2 months ago

Done. I checked that on cnr2000 all the parameters match; the only difference is the number of decimal places. Java prints 3 decimals, while this implementation does not truncate, so it prints about 15.

Should I also truncate it to 3 decimals?

I also added the length field, which contains the length in bits of the compressed graph.
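
On the decimal-places question, a minimal sketch of how the output could be limited to 3 decimals with Rust's standard formatting. The helper name `format_stat` is hypothetical. Note that `{:.3}` rounds rather than truncates, which may or may not match the Java output exactly in the last digit.

```rust
/// Formats a statistic with three decimal places (rounded, not truncated).
fn format_stat(value: f64) -> String {
    format!("{:.3}", value)
}
```

With this helper, a value such as 12.5 is written as `12.500`.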