Closed yyou1996 closed 2 years ago
Thanks for your interest!
The files generated by the code are named as follows: x_y_z.npy contains the hyperedges of size z whose occurrence frequency is between x and y. The 3_freq folders hold temporary files.
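As a rough illustration of that naming scheme, here is a minimal sketch that splits a generated file name into its three numbers. The helper name parse_generated_name and the dictionary keys are hypothetical and not part of the MATCHA code. Whether the frequency bounds are inclusive is not stated here, so the sketch only reports them.

```python
import re

# Matches names of the form x_y_z.npy, e.g. "3_5_3.npy".
_GENERATED = re.compile(r"^(\d+)_(\d+)_(\d+)\.npy$")


def parse_generated_name(name):
    """Return the frequency range and hyperedge size encoded in a file name.

    Hypothetical helper for illustration only. Returns None for names
    that do not follow the x_y_z.npy pattern (e.g. "3_freq").
    """
    match = _GENERATED.match(name)
    if match is None:
        return None
    low, high, size = (int(group) for group in match.groups())
    return {"min_freq": low, "max_freq": high, "size": size}
```

For example, parse_generated_name("5_8_3.npy") describes size-3 hyperedges with frequency between 5 and 8, while names such as "3_freq" or "upper_3.npy" fall outside this pattern and give None.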
For the data provided in the zip file, the x_y_z naming is the same as above. Files with "intra_inter" in the name hold a flag for each hyperedge indicating whether it is purely intra-chromosomal or inter-chromosomal; files with "filter" in the name hold the actual hyperedges. The two files for a given x, y, z have the same length.
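To make the pairing concrete, the sketch below groups the zip file names by their (x, y, z) key so that each "filter" file sits next to its "intra_inter" file. The function name pair_zip_files is hypothetical; the two name patterns are taken from the file listing in this thread.

```python
import re
from collections import defaultdict

# Name patterns as they appear in the zip listing in this thread.
_FILTER = re.compile(r"^(\d+)_(\d+)_filter_(\d+)\.npy$")      # x_y_filter_z.npy
_FLAG = re.compile(r"^(\d+)_(\d+)_(\d+)_intra_inter\.npy$")   # x_y_z_intra_inter.npy


def pair_zip_files(names):
    """Group file names by (x, y, z); hypothetical helper for illustration."""
    pairs = defaultdict(dict)
    for name in names:
        for kind, pattern in (("filter", _FILTER), ("intra_inter", _FLAG)):
            match = pattern.match(name)
            if match:
                key = tuple(int(group) for group in match.groups())
                pairs[key][kind] = name
    return dict(pairs)
```

With the listing from the question, the key (3, 5, 3) would map to 3_5_filter_3.npy and 3_5_3_intra_inter.npy.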
The "intra_inter" files never affect training. They are only used in benchmarking, where we calculate AUC / accuracy / AUPR separately for intra- and inter-chromosomal hyperedges.
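A minimal sketch of how such a separate evaluation could be set up, assuming each flag file holds one entry per hyperedge in the same order as the matching "filter" file. The function split_by_flag is hypothetical and is not the benchmarking code from the repo. Which flag value means intra and which means inter is not stated in this thread, so the sketch simply groups by whatever values occur.

```python
from collections import defaultdict


def split_by_flag(hyperedges, flags):
    """Group hyperedges by their intra/inter flag value.

    Hypothetical helper. Works on any two equal-length sequences,
    for example arrays loaded from a "filter" file and its
    "intra_inter" file.
    """
    if len(hyperedges) != len(flags):
        raise ValueError("hyperedge and flag files must have the same length")
    groups = defaultdict(list)
    for edge, flag in zip(hyperedges, flags):
        groups[flag].append(edge)
    return dict(groups)


# Toy data standing in for the loaded files (values are made up).
toy_edges = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
toy_flags = [0, 1, 0]
toy_groups = split_by_flag(toy_edges, toy_flags)
```

Each group can then be scored on its own with whatever metric is being benchmarked.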
Overall, the main branch in the current repo is maintained in a more readable manner and is compatible with the SPRITE data.
Dear MATCHA authors,
Thanks for your great efforts. I am currently trying to reproduce your results before using the code further. According to the readme https://github.com/ma-compbio/MATCHA/tree/master/History_version#running-command, the hyperedges with occurrence frequency 2 are not included, which is important for hyperedges of size 5 according to the paper. I therefore tried to regenerate them with your code.
Below is what I regenerated with process_SPRITE.py and analysis_SPRITE.py:
2_3_3.npy 3_5_3.npy 3_freq 5_8_3.npy 8_12_3.npy dict_3node upper_3.npy
Below is what I extracted from the data you provided, occ_3_8.zip:
3_5_3_intra_inter.npy 3_5_4_intra_inter.npy 3_5_5_intra_inter.npy 3_5_filter_3.npy 3_5_filter_4.npy 3_5_filter_5.npy 5_8_3_intra_inter.npy 5_8_4_intra_inter.npy 5_8_5_intra_inter.npy 5_8_filter_3.npy 5_8_filter_4.npy 5_8_filter_5.npy
I am wondering how the two sets of files correspond, since they have different names.