Open EmiKib opened 1 month ago
for reference: https://github.com/CPernet/metaprivBIDS
OK, I wanted to try my luck today, but failed. I have trouble installing metaprivBIDS following the instructions, both the README on @CPernet's GitHub (which seems incomplete) and the getting_started rst file in the docs. There seems to be an issue with installing/building pygraphviz. Giving up for now.
@schoffelen does it fail at "conda install graphviz pygraphviz" or at "pip install -e . " ?
Hi @EmiKib, thanks for getting back about this. I must confess that I haven't tried 'conda install'; I used pip install for graphviz etc.
I freshly caffeinated myself (and the compute cluster) and tried again just now. With conda install I make it through the installation process. Thanks! PS: you may consider adding a line to the README.md to cd into the metaPrivBIDS repo (after git cloning it) before calling pip install -e .
@schoffelen great that it works now. I will add that cd line now.
If you have any suggestions regarding anything in the app (design, buttons, functionality), please let me know. I am currently working on making it a bit more visually pleasing.
I am currently on holiday, but I will try to be fast at replying.
Thanks, I will play around a bit. Don't bother replying during your holiday. Enjoy!
An addition to the .tsv scrambler could be to look into how much leakage there is in a scrambled dataset.
Simple metrics could be full row leakage & partial row leakage with the option to set a value representing NaN.
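A minimal sketch of what these two metrics could look like, using only the standard library. The definitions are assumptions for illustration, not something already in the scrambler: a scrambled row counts as a full leak if all of its non-NaN cells match a single original row, and as a partial leak if the best-matching original row shares at least `partial_threshold` of them. The function names, the `"n/a"` placeholder and the threshold are all hypothetical.

```python
def best_match_fraction(row, original_rows, nan_value):
    """Largest share of non-NaN cells in `row` that match one original row."""
    # Cells holding the NaN placeholder are ignored entirely.
    cols = [i for i, value in enumerate(row) if value != nan_value]
    if not cols:
        return 0.0
    best = 0
    for orig in original_rows:
        hits = sum(1 for i in cols if orig[i] == row[i])
        best = max(best, hits)
    return best / len(cols)


def row_leakage(original_rows, scrambled_rows, nan_value="n/a",
                partial_threshold=0.5):
    """Return the fractions of scrambled rows that leak fully or partially.

    Assumed definitions (hypothetical, for illustration only):
    full    = every non-NaN cell matches a single original row
    partial = at least `partial_threshold` of the cells match, but not all
    """
    scores = [best_match_fraction(row, original_rows, nan_value)
              for row in scrambled_rows]
    n = len(scores)
    if n == 0:
        return {"full": 0.0, "partial": 0.0}
    full = sum(1 for s in scores if s == 1.0)
    partial = sum(1 for s in scores if partial_threshold <= s < 1.0)
    return {"full": full / n, "partial": partial / n}


# Toy data: the third row survives scrambling unchanged.
original = [["s1", "25", "F"], ["s2", "31", "M"], ["s3", "n/a", "F"]]
scrambled = [["s1", "31", "F"], ["s2", "25", "M"], ["s3", "n/a", "F"]]
leakage = row_leakage(original, scrambled, nan_value="n/a")
```

Comparing every scrambled row against every original row is quadratic in the number of rows, which should be fine for typical participants.tsv sizes but would need a hashed lookup for large tables.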
In addition, running a comparison with the Privacy Information Factor (implemented in the metaprivBIDS application), specifically the Field Information Gain, would give an overview of how the original and scrambled datasets assess the information gain for the individual columns. That way the user has an overview of how much information is retained in their scrambled dataset compared to the original. This would most often be almost the same, as the permutation is row based, but it gives a quantifiable metric the user can refer back to.
Lastly, a comparison of the correlation matrices between columns of the scrambled and original datasets could be an option to ensure that the dataset has been permuted to a satisfactory level.
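The correlation comparison could be sketched roughly as below, again with the standard library only. It assumes numeric columns passed as a dict of column name to list of floats, and uses plain Pearson correlation; the function names and the toy data are hypothetical and not part of the existing scrambler.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def correlation_pairs(columns):
    """Correlation for every unordered pair of columns."""
    return {(a, b): pearson(columns[a], columns[b])
            for a, b in combinations(columns, 2)}


def correlation_shift(original, scrambled):
    """Map each column pair to (original r, scrambled r)."""
    before = correlation_pairs(original)
    after = correlation_pairs(scrambled)
    return {pair: (before[pair], after[pair]) for pair in before}


# Toy data: height is perfectly correlated with age before scrambling.
original_cols = {"age": [20.0, 30.0, 40.0, 50.0],
                 "height": [150.0, 160.0, 170.0, 180.0]}
scrambled_cols = {"age": [20.0, 30.0, 40.0, 50.0],
                  "height": [160.0, 150.0, 180.0, 170.0]}
shift = correlation_shift(original_cols, scrambled_cols)
```

A pair whose scrambled correlation stays close to the original one suggests that the permutation left much of the between-column structure in place. What counts as "satisfactory" would still be up to the user.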
I hope this could be useful? If so, I could try to add it so it can run with the pipeline or as a separate pipeline.