nf-core / atacseq

ATAC-seq peak-calling and QC analysis pipeline
https://nf-co.re/atacseq
MIT License

bug in the plot_homer_annotatepeaks.r script #86

Closed: anvlasova closed this issue 4 years ago

anvlasova commented 4 years ago

Hi,

Some annotations contain special characters in the gene names; for example, in the dm3 genome some genes have a ' (apostrophe) symbol in the gene name.

The function 'read.table' in the script plot_homer_annotatepeaks.r can't read such files properly, and the whole pipeline crashes in the 'merge_library_macs_qc' step, whereas read.csv can process those files without problems. I guess this is because the default quote and comment character options differ between these two functions: https://stackoverflow.com/questions/12828438/read-csv-vs-read-table
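The difference in defaults can be checked directly in an R session. This is a minimal sketch, assuming a standard R installation where the `utils` package is attached; it only inspects the function signatures and does not touch the pipeline script.

```r
# Inspect the default quoting and comment settings of the two readers.
defaults <- data.frame(
  reader       = c("read.table", "read.csv"),
  quote        = c(formals(read.table)$quote, formals(read.csv)$quote),
  comment.char = c(formals(read.table)$comment.char,
                   formals(read.csv)$comment.char),
  stringsAsFactors = FALSE
)
print(defaults)
```

By default read.table treats both " and ' as quote characters and # as the start of a comment, while read.csv only treats " as a quote character and has no comment character. An apostrophe in a gene name therefore opens a quoted string under read.table but stays a literal character under read.csv.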

Can you please fix it?

thanks, Anna

drpatelh commented 4 years ago

Hi @anvlasova Thanks for reporting this :+1:

Would you mind uploading a file I can use to test this please? Alternatively, please feel free to submit a PR with a fix to the dev branch. The pipeline is pretty much ready for the next release so it will be fixed in the next version.

anvlasova commented 4 years ago

Hi @drpatelh,

We found this problem with the following annotation file: ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r5.57_FB2014_03/gff/dmel-all-filtered-r5.57.gff.gz Some genes have a ' character in their names, e.g. the gene "Name=beta'COP" on chr 2L:13378902-13382382, and this causes the problem. I will try to contribute to the dev branch.

thanks, Anna

drpatelh commented 4 years ago

Thanks @anvlasova. Just to clarify, is this the change you had to make in order to get things working? https://github.com/nf-core/atacseq/pull/92/commits/cf6c5c39d890a3a69afa42b9316db02670b85f8b

anvlasova commented 4 years ago

Hi @drpatelh, yes, it is, thank you! I hope it will not break something else. Thanks, Anna

drpatelh commented 4 years ago

I hope not! I remember specifically adding in quote="" to fix another bug. Hopefully it should be fine ;)
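To illustrate why the quote setting matters for names like beta'COP, here is a minimal sketch with made-up data. The column names, peak IDs and the CG gene names are hypothetical and do not reproduce the pipeline's actual HOMER output, and the calls below are not claimed to be the change made in the linked commit.

```r
# Hypothetical tab-separated annotation table; one gene name contains an
# apostrophe, like the FlyBase gene beta'COP mentioned above.
annot_lines <- c(
  "PeakID\tGeneName\tDistance",
  "peak_1\tCG1234\t100",
  "peak_2\tbeta'COP\t250",
  "peak_3\tCG5678\t-40"
)

# Default quoting treats ' as a quote character, so the apostrophe opens an
# unterminated quoted string. Capture any warning or error that R raises
# rather than letting it stop the script.
default_result <- tryCatch(
  read.table(text = annot_lines, header = TRUE, sep = "\t"),
  warning = function(w) conditionMessage(w),
  error   = function(e) conditionMessage(e)
)
print(default_result)

# Disabling quote and comment handling keeps the apostrophe as a literal.
literal <- read.table(text = annot_lines, header = TRUE, sep = "\t",
                      quote = "", comment.char = "",
                      stringsAsFactors = FALSE)
print(literal)
```

One trade-off of quote="" is that double quotes are also kept as literal characters, so it only suits files whose fields are not wrapped in quotes.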

Fixed in https://github.com/nf-core/atacseq/pull/92