smithlabcode / methpipe

A pipeline for analyzing DNA methylation data from bisulfite sequencing.
http://smithlabresearch.org/methpipe

bigWig_to_methcounts --missing file or installation step. #196

Blosberg closed this issue 2 years ago

Blosberg commented 2 years ago

Thanks for providing this software. After unpacking and configuring methpipe-[version].tar.gz, I reached the point in the manual that advises converting bigWig files to the methcounts file format using bigWig_to_methcounts.py in the folder METHPIPE_ROOT/src/utils. Unfortunately, I could not find a file with that name in that directory after unpacking, so I cloned the repository and took the file from the same path there.

Proceeding with the instructions, I then get the following error:

$ python bigWig_to_methcounts.py -m S00BHQ51.CPG_methylation_calls.bs_call.GRCh38.20160531.bw -r S00BHQ51.CPG_methylation_calls.bs_cov.GRCh38.20160531.bw -o S00BHQ51.meth -p /path/to/bigWigToBedGraph
Traceback (most recent call last):
  File "bigWig_to_methcounts.py", line 142, in <module>
    main()
  File "bigWig_to_methcounts.py", line 120, in main
    order = cmp(meth_coordinate, read_coordinate)
NameError: name 'cmp' is not defined

Presumably I'm missing some additional files. I see the note at the release link stating the following:

Please make sure to download files methpipe-5.0.1.tar.gz or methpipe-5.0.1.zip and not the "Source code" files provided automatically by github, as these do not contain all necessary files for compilation.

Upon downloading and unpacking the tar for 5.0.1, as well as the previous version, and following analogous install instructions, the file bigWig_to_methcounts.py is still not present in methpipe-5.0.1/src/utils. Is there a missing file? Thanks for any insight you can offer.

andrewdavidsmith commented 2 years ago

@Blosberg I’ll try to make sure the docs and release are properly synchronized. In the meantime, if you can give an indication of what you’re trying to do, I might be able to help sooner. I recall there were reasons in the past why it wasn’t a good idea to do that file conversion.

Blosberg commented 2 years ago

Thanks Andrew. Ultimately, what I want to do is annotate PMDs with the pmd function, but to do that I want to make sure my data is properly formatted into the right columns as per the .meth format. (I was starting with some bigWig files that I know well to see how things are converted, with the final step being symmetric-cpgs -o human_esc_CpG.meth human_esc.meth; the vignette doesn't show the columns of the output there, though.) Can I assume that after symmetric-cpgs the merged meth file (human_esc_CpG.meth, in the example from the docs) has the following 6 columns:

Chr, start, strand_placeholder, context, F, cov

where Chr is the chromosome; start is the position of the "C" on the + strand; strand_placeholder is just a placeholder column from merging and is always "+"; context is always "CpG"; F is a float denoting the methylation fraction; and cov is an int denoting the number of reads covering either strand of this site.

If all that's correct, then I will rearrange my existing data into a .meth file with that format and run ./methpipe/pmd on it.
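To make the layout in question concrete, here is a minimal sketch of a line parser, assuming the six whitespace-separated columns listed above. The names `MethSite` and `parse_meth_line` and the sample line are illustrative only and are not part of methpipe.

```python
from typing import NamedTuple


class MethSite(NamedTuple):
    """One record in the assumed 6-column .meth layout."""
    chrom: str       # chromosome name
    pos: int         # position of the C on the + strand
    strand: str      # expected to be "+" after merging
    context: str     # expected to be "CpG"
    level: float     # methylation fraction, between 0 and 1
    coverage: int    # reads covering the site


def parse_meth_line(line: str) -> MethSite:
    """Parse one line and check it against the assumed layout."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
    chrom, pos, strand, context, level, coverage = fields
    site = MethSite(chrom, int(pos), strand, context,
                    float(level), int(coverage))
    if not 0.0 <= site.level <= 1.0:
        raise ValueError(f"methylation fraction out of range: {site.level}")
    if site.coverage < 0:
        raise ValueError(f"negative coverage: {site.coverage}")
    return site


# Made-up example line, not real data
example = parse_meth_line("chr1\t10468\t+\tCpG\t0.75\t8")
```

Running a parser like this over the first few lines of a rearranged file would catch swapped or missing columns before the file is handed to pmd.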

mengzhou commented 2 years ago

Hi @Blosberg, this script is a wrapper around bigWigToBedGraph, which belongs to UCSC kentUtils. To run it you will need Python 2.x and bigWigToBedGraph installed on your system. From the error message you provided, I think you will be able to run it using Python 2. Since this script was created a long time ago, I'm not sure whether its output is still compatible with the current Methpipe release.
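For background on the NameError: Python 3 removed the built-in `cmp`. If staying on Python 3 is preferable, a minimal sketch of a shim follows, assuming the script uses `cmp` only for three-way comparison of coordinates. The tuple form of the coordinates below is a guess for illustration, and the script may contain other Python 2 constructs that this does not address.

```python
def cmp(a, b):
    """Three-way comparison: -1 if a < b, 0 if a == b, 1 if a > b."""
    # Booleans are integers in Python, so this subtraction gives -1, 0, or 1.
    return (a > b) - (a < b)


# Hypothetical coordinates as (chromosome, position) tuples
meth_coordinate = ("chr1", 10468)
read_coordinate = ("chr1", 10470)
order = cmp(meth_coordinate, read_coordinate)
```

Defining the shim near the top of the script would leave a call such as the one on line 120 of the traceback unchanged.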

Blosberg commented 2 years ago

Hi @mengzhou, thanks very much. Using Python 2 I was able to convert the files I knew and inspect the full process. I guess my real question was about the format of .meth files (sorry if my question was a bit of an XY problem). I can see the pmd output now and understand how the PMDs were calculated. Thanks for your help!