adoebley / Griffin

A flexible framework for nucleosome profiling of cell-free DNA

difficulties using your code #6

Closed: bzip2 closed this issue 1 year ago

bzip2 commented 2 years ago

Hi.

It would be great to apply your methods and code to new data, but I'm finding it very difficult. I've looked through all the scripts (yaml, Python and snakefile, though not the notebooks) in this repo and in Griffin_analyses.

If you'd like others to use your code (and cite your paper), and it makes things easier, I'd suggest dropping snakemake.

adoebley commented 1 year ago

Hi,

Thank you for reaching out with the comments and suggestions and sorry for the delay in responding! I'd recommend you try running the demo (in the wiki) since I think it might answer a lot of your questions.

I've updated the readme to reflect the steps in the current version of the pipeline. We're in the process of doing revisions and some things have changed; hopefully the final version of the paper will be available in the near future. The current order of steps is:

  1. griffin_genome_GC_frequency (already complete if you're using hg38)
  2. griffin_GC_and_mappability_correction (mappability correction is turned off by default)
  3. griffin_nucleosome_profiling

griffin_filter_sites is no longer part of the pipeline and I removed it from the readme.
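As a rough illustration of the step order above, here is a minimal Python sketch. It is not part of Griffin: `PIPELINE_STEPS` and `steps_to_run` are hypothetical names, and the only logic encoded is what the list states (three steps in order, with the first one already complete for hg38).

```python
from typing import List

# Step names taken from the list above, in the order they are run.
PIPELINE_STEPS = [
    "griffin_genome_GC_frequency",
    "griffin_GC_and_mappability_correction",
    "griffin_nucleosome_profiling",
]


def steps_to_run(genome_build: str) -> List[str]:
    """Return the steps still to be run for a genome build (hypothetical helper)."""
    # Assumption: for hg38 the genome GC frequency step is already complete,
    # so only the two remaining steps are needed.
    if genome_build == "hg38":
        return PIPELINE_STEPS[1:]
    return list(PIPELINE_STEPS)


if __name__ == "__main__":
    for step in steps_to_run("hg38"):
        print(step)
```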

The mappable regions file used for "--mappable_regions_path" in the griffin_GC_and_mappability_correction step is specified in the config:

  mappable_regions: ../../Ref/k100_minus_exclusion_lists.mappable_regions.hg38.bed

This file contains every position with a mappability score of 1 that doesn't overlap centromeres, gaps, fix patches, alternative haplotypes, or excluded regions. We took out the repeat masker filter because we found that the filters above already remove the regions that were causing problems, so there is no need to exclude all repeats. I've removed the repeat masker filter from the readme.
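To make the idea of "mappable positions that don't overlap excluded regions" concrete, here is a minimal Python sketch of interval subtraction. It is not how the Griffin file was actually generated: the function names `merge_intervals` and `subtract_intervals` are hypothetical, the coordinates are made up, and intervals are assumed to be BED-style (0-based, half-open).

```python
from typing import List, Tuple

# (chrom, start, end), assumed BED-style: 0-based start, end exclusive.
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch on the same chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def subtract_intervals(mappable: List[Interval],
                       excluded: List[Interval]) -> List[Interval]:
    """Return the parts of `mappable` that do not overlap any `excluded` interval."""
    excluded = merge_intervals(excluded)
    kept: List[Interval] = []
    for chrom, start, end in merge_intervals(mappable):
        cursor = start  # first position not yet kept or excluded
        for ex_chrom, ex_start, ex_end in excluded:
            # Skip exclusions on other chromosomes or outside the remaining span.
            if ex_chrom != chrom or ex_end <= cursor or ex_start >= end:
                continue
            if ex_start > cursor:
                kept.append((chrom, cursor, ex_start))
            cursor = max(cursor, ex_end)
        if cursor < end:
            kept.append((chrom, cursor, end))
    return kept


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    mappable = [("chr1", 0, 100), ("chr2", 0, 50)]
    excluded = [("chr1", 20, 30), ("chr1", 25, 40), ("chr2", 40, 60)]
    for chrom, start, end in subtract_intervals(mappable, excluded):
        print(chrom, start, end, sep="\t")
```

In practice an established tool for BED interval arithmetic would be a more robust choice than hand-rolled code; the sketch is only meant to show what the resulting file represents.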

We are also working on a WDL pipeline as an alternative to the snakemake one, but I think it's going to be a while before that is available.

Let me know if you have any further questions.