marcus1487 / nanoraw

Genome-guided re-segmentation and visualization for raw nanopore sequencing data.
https://pypi.python.org/pypi/nanoraw

The meanings of "--statistics-filename" #53

Closed liuqianhn closed 5 years ago

liuqianhn commented 6 years ago

Hi @marcus1487

Could you tell me the format of each line in the file passed to "--statistics-filename" for "nanoraw plot_most_significant"? I think the first column is the position and the remaining columns are statistics such as the p-value, but I do not know which column holds which statistic. Could you please explain? Thanks.

marcus1487 commented 6 years ago

Unfortunately, nanoraw is no longer being supported. It has been superseded by the ONT-supported Tombo package (https://github.com/nanoporetech/tombo).

The statistics file is not intended for external use, nor is it designed to accept external inputs. To obtain files for external use, please use the write_wiggles command.
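For readers who want to process write_wiggles output programmatically, here is a minimal sketch of a reader for the standard UCSC wiggle text format. It assumes the files follow the usual `variableStep`/`fixedStep` layout; which variant nanoraw actually writes, and what the values represent, has not been verified here. The function names `parse_wiggle` and `top_sites` and the demo values are hypothetical, for illustration only.

```python
def parse_wiggle(lines):
    """Yield (chrom, position, value) tuples from wiggle-format lines.

    Positions are reported as written in the file; the wiggle format
    itself uses 1-based positions.
    """
    chrom, mode, pos, step = None, None, None, 1
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if fields[0] in ("variableStep", "fixedStep"):
            # Declaration line: key=value parameters follow the keyword.
            mode = fields[0]
            params = dict(f.split("=", 1) for f in fields[1:])
            chrom = params["chrom"]
            if mode == "fixedStep":
                pos = int(params["start"])
                step = int(params.get("step", 1))
            continue
        if mode == "variableStep":
            # Data line: "<position> <value>"
            yield chrom, int(fields[0]), float(fields[1])
        elif mode == "fixedStep":
            # Data line: "<value>"; the position advances by `step`.
            yield chrom, pos, float(fields[0])
            pos += step
        else:
            raise ValueError("data line found before any step declaration")


def top_sites(records, n, largest_first=True):
    """Return the n records ranked by value.

    Whether larger or smaller values are "more significant" depends on
    what the file stores, so the direction is left to the caller.
    """
    return sorted(records, key=lambda r: r[2], reverse=largest_first)[:n]


# Made-up demo input, not real nanoraw output.
demo = [
    "track type=wiggle_0 name=demo",
    "variableStep chrom=chr1 span=1",
    "10 0.5",
    "12 3.2",
    "fixedStep chrom=chr2 start=100 step=1",
    "1.5",
    "2.5",
]
records = list(parse_wiggle(demo))
best = top_sites(records, 2)
```

With a real file, `parse_wiggle(open(path))` can be used in place of the demo list, and the ranking can be repeated across samples in a loop.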

liuqianhn commented 6 years ago

Hi @marcus1487

Thanks for your reply. I have run nanoraw plot_most_significant on tens of samples, and checking the rank of each modification one by one is time-consuming, so I would like to write code that uses the p-values automatically. write_wiggles may serve a different purpose. It would be great if you could briefly explain the format. Anyway, thanks.

marcus1487 commented 6 years ago

I understand, but there are several reasons that I have not documented the statistics function. Key among these is that the genomic positions are 0-based, unlike most genomic coordinate conventions, which are 1-based. There are other issues due to how the statistics are processed internally by nanoraw. If you would like to look into this further, here is the place in the code where the statistics file is written out: https://github.com/marcus1487/nanoraw/blob/master/nanoraw/nanoraw_stats.py#L152
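To make the 0-based caveat concrete, here is a small sketch of converting a 0-based position into the two conventions most tools expect. It assumes nothing about the statistics file's column layout, which is undocumented; the helper names are hypothetical.

```python
def to_one_based(pos0):
    """Convert a 0-based position to a 1-based position (as used by SAM, VCF, GFF)."""
    if pos0 < 0:
        raise ValueError("0-based positions cannot be negative")
    return pos0 + 1


def to_bed_interval(pos0):
    """Convert a 0-based position to a BED-style half-open interval (start, end)."""
    if pos0 < 0:
        raise ValueError("0-based positions cannot be negative")
    return (pos0, pos0 + 1)


first_base = to_one_based(0)
bed = to_bed_interval(41)
```

Forgetting this shift places every reported site one base away from the intended position when comparing against 1-based annotations.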

I hope this helps!

liuqianhn commented 6 years ago

Hi @marcus1487 Thank you very much. It is very helpful.