koszullab / GRAAL

(check out instaGRAAL for a faster, updated program!) This program is from Marie-Nelly et al., Nature Communications, 2014 (High-quality genome assembly using chromosomal contact data), also Marie-Nelly et al., 2013, PhD thesis (https://www.theses.fr/2013PA066714)
https://research.pasteur.fr/fr/software/graal-software-for-genome-assembly-from-chromosome-contact-frequencies/

No documentation on data preparation #9

Closed: cooketho closed this issue 8 years ago

cooketho commented 8 years ago

After fixing several bugs (see pull request), I'm able to successfully run GRAAL on the test data set (trichoderma). But the README is opaque when it comes to formatting and loading my own data set. The two biggest problems are:

1) "(see start_graal.pdf and pending_graal.pdf)". Where are these files? They are referenced multiple times but I don't see them.

2) "A pyramid of contact matrices, P = {M0, M1, ..., Mk}, is a data structure representing the 3C/HiC data at different scales." OK fine. But am I supposed to generate this data structure myself? How are the directory and/or files supposed to be structured? Can GRAAL do it for me? No guidance is provided.

I'm submitting a paper in the next few months and I'd love to use GRAAL and cite your work, but the documentation needs to be improved if I'm going to be able to do that.

rkoszul commented 8 years ago

Hello,

You can generate GRAAL-compatible datasets using HiC-Box. The steps should be documented in that repo's README. It is true that GRAAL's README did not mention this; thank you for pointing it out. I have added a note that should clarify things.

Should you run into trouble with generating your own datasets using HiC-Box, feel free to open an issue on that repo and we'll look into it and try to improve the documentation accordingly.
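
For readers unfamiliar with the "pyramid of contact matrices" quoted in the question, the idea of representing the same contact map at several scales can be illustrated with a minimal sketch. This is not GRAAL's or HiC-Box's code, and it says nothing about the directory or file layout those tools expect. The function names `coarsen` and `build_pyramid`, the block-summing rule, and the default factor of 3 are all assumptions made for illustration only.

```python
import numpy as np


def coarsen(matrix, factor):
    """Sum contacts over factor x factor blocks of bins (hypothetical helper)."""
    matrix = np.asarray(matrix)
    n = matrix.shape[0]
    # Zero-pad so the matrix size is a multiple of the factor.
    pad = (-n) % factor
    padded = np.pad(matrix, ((0, pad), (0, pad)), mode="constant")
    m = padded.shape[0] // factor
    # Group rows and columns into blocks, then sum within each block.
    return padded.reshape(m, factor, m, factor).sum(axis=(1, 3))


def build_pyramid(m0, n_levels, factor=3):
    """Return [M0, M1, ..., Mk], each level coarser than the previous one.

    The factor of 3 is an assumption for this sketch, not a documented
    GRAAL or HiC-Box setting.
    """
    levels = [np.asarray(m0)]
    for _ in range(n_levels):
        levels.append(coarsen(levels[-1], factor))
    return levels


# Toy example: a random symmetric 9 x 9 contact matrix and two coarser levels.
rng = np.random.default_rng(0)
raw = rng.integers(0, 10, size=(9, 9))
m0 = raw + raw.T
pyramid = build_pyramid(m0, n_levels=2)
shapes = [level.shape for level in pyramid]
```

Because each level only sums the bins of the level below it, the total number of contacts is preserved across the pyramid. For real data, the datasets should still be generated with HiC-Box as described above.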