SolidBench / SolidBench.js

A benchmark for Solid to simulate vaults with social network data.
MIT License

Combine primary and auxiliary fragmentation steps #15

Closed by surilindur 3 months ago

surilindur commented 3 months ago

This is an idea I had while working on the multiple interfaces thing, to combine the main and auxiliary fragmentations into one pass. The auxiliary phase is only used to add some noise, and the configs are mostly identical, so it feels like they could be combined.

This would make it easier to, for example, define fragmentation strategies and sinks in the main config so that they apply to the main and auxiliary data together. It would also remove the challenges that come with two separate passes, such as both passes potentially writing to the same files, or a summary of the data missing the auxiliary or primary data entirely.
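The single-pass idea can be sketched as follows. This is only an illustration under stated assumptions: `Quad`, `Strategy`, `CountingSink`, and `fragmentSinglePass` are hypothetical names made up for this sketch, not the actual fragmenter API or configuration format. It shows that when both inputs flow through one pass with one sink, a summary naturally covers the primary and auxiliary data together.

```typescript
// Hypothetical, simplified types for illustration only.
type Quad = { subject: string; predicate: string; object: string };

// A strategy maps a quad to the file paths it should be written to.
type Strategy = (quad: Quad) => string[];

// A sink that keeps quads in memory and can summarise everything it received.
class CountingSink {
  readonly files = new Map<string, Quad[]>();

  write(path: string, quad: Quad): void {
    const quads = this.files.get(path) ?? [];
    quads.push(quad);
    this.files.set(path, quads);
  }

  summary(): { files: number; quads: number } {
    let quads = 0;
    this.files.forEach((list) => {
      quads += list.length;
    });
    return { files: this.files.size, quads };
  }
}

// One pass over all sources: every strategy and the single sink see
// primary and auxiliary quads alike.
function fragmentSinglePass(
  sources: Quad[][],
  strategies: Strategy[],
  sink: CountingSink,
): void {
  for (const source of sources) {
    for (const quad of source) {
      for (const strategy of strategies) {
        for (const path of strategy(quad)) {
          sink.write(path, quad);
        }
      }
    }
  }
}

// Made-up example data standing in for the primary and auxiliary inputs.
const primary: Quad[] = [
  {
    subject: "http://example.org/pods/alice/profile/card#me",
    predicate: "http://xmlns.com/foaf/0.1/name",
    object: "\"Alice\"",
  },
];
const auxiliary: Quad[] = [
  {
    subject: "http://example.org/pods/alice/noise/1",
    predicate: "http://example.org/vocab#noise",
    object: "\"x\"",
  },
];

// Hypothetical strategy: one file per subject document (fragment identifier dropped).
const bySubject: Strategy = (quad) => [quad.subject.split("#")[0]];

const sink = new CountingSink();
fragmentSinglePass([primary, auxiliary], [bySubject], sink);
console.log(sink.summary()); // { files: 2, quads: 2 }
```

With two separate passes, each pass would need its own sink instance or would have to coordinate writes to shared files; in the sketch above that coordination is not needed because there is only one sink.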

Any thoughts are welcome, I do not know if this is acceptable from some other points of view or not. It was never clear to me why the auxiliary phase was separate, since in practice it probably does not need to be (the way it is implemented now).

coveralls commented 3 months ago

Pull Request Test Coverage Report for Build 9347007394

Totals:
Change from base Build 8938887278: 0.0%
Covered Lines: 112
Relevant Lines: 112

rubensworks commented 3 months ago

To be honest, I don't remember why we had these two passes 😅 It may just be a historical thing that is not necessary anymore. So if you notice that experiments still run fine with a single pass, I'm definitely open to this.

Just let me know when this is ready for review. Will be a breaking change of course.

surilindur commented 3 months ago

> So if you notice that experiments still run fine with a single pass, I'm definitely open to this.

The experiments should run fine. I just double-checked, and the output is identical by file hash when the phases are combined. The few files with different hashes (namely the WebIDs) just have the triples in a different order.
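For anyone wanting to repeat this kind of check, one possible approach is sketched below. The thread does not say how the comparison was actually done, so this is an assumption-laden example: it assumes the generated files use a line-based serialization such as N-Triples or N-Quads with stable blank node labels, and the names `classify`, `canonical`, and `compareOutputs` are made up for this sketch. Each file pair is compared by SHA-256 first, and then again after sorting lines, to tell "same triples, different order" apart from a real difference.

```typescript
import { createHash } from "node:crypto";
import { readdirSync, readFileSync, statSync } from "node:fs";
import { join, relative } from "node:path";

type Verdict = "identical" | "reordered" | "different" | "missing";

function sha256(data: string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Order-insensitive form: trimmed, non-empty lines in sorted order.
// Only meaningful for line-based formats (one statement per line).
function canonical(text: string): string {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .sort()
    .join("\n");
}

// Compare two file contents: byte-identical, same lines in another order, or different.
function classify(a: string, b: string): Verdict {
  if (sha256(a) === sha256(b)) return "identical";
  if (sha256(canonical(a)) === sha256(canonical(b))) return "reordered";
  return "different";
}

// Recursively list files under root as paths relative to root.
function listFiles(root: string, dir: string = root): string[] {
  const result: string[] = [];
  for (const entry of readdirSync(dir)) {
    const full = join(dir, entry);
    if (statSync(full).isDirectory()) {
      result.push(...listFiles(root, full));
    } else {
      result.push(relative(root, full));
    }
  }
  return result.sort();
}

// Compare two output directories file by file.
// Files present in only one of the two directories are reported as "missing".
function compareOutputs(dirA: string, dirB: string): Map<string, Verdict> {
  const report = new Map<string, Verdict>();
  const remaining = new Set(listFiles(dirB));
  for (const file of listFiles(dirA)) {
    if (!remaining.has(file)) {
      report.set(file, "missing");
      continue;
    }
    remaining.delete(file);
    const a = readFileSync(join(dirA, file), "utf8");
    const b = readFileSync(join(dirB, file), "utf8");
    report.set(file, classify(a, b));
  }
  remaining.forEach((file) => report.set(file, "missing"));
  return report;
}
```

Calling `compareOutputs` with the output directory of the two-pass run and that of the combined run (both paths are whatever the local setup uses) returns a per-file verdict. For the result described above, one would expect mostly `identical` entries and `reordered` for the WebID files. Line sorting is a shortcut, not full RDF graph isomorphism, so it would misreport files whose blank node labels differ between runs.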

Once this is released, the jbr experiment configurations will need updating, but I also have a branch for that. I will look into making a PR for it later.

This should be ready for review now!

rubensworks commented 3 months ago

Thanks!

Do let me know if no more changes are needed and a new (major) update should be released.

surilindur commented 3 months ago

I think it might make sense to wait for the multiple interfaces work before releasing the next major. I will try to make a PR later this week.