Closed anvlasova closed 4 years ago
Hi @anvlasova Thanks for reporting this :+1:
Would you mind uploading a file I can use to test this please? Alternatively, please feel free to submit a PR with a fix to the dev branch. The pipeline is pretty much ready for the next release so it will be fixed in the next version.
Hi @drpatelh,
we found this problem with this annotation file: ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r5.57_FB2014_03/gff/dmel-all-filtered-r5.57.gff.gz Some genes have a ' character in their names, e.g. the gene "Name=beta'COP" on chr 2L:13378902-13382382, and this causes the problem. I will try to contribute to the dev branch,
thanks, Anna
Thanks @anvlasova . Just to clarify this is the change you had to make in order to get things working? https://github.com/nf-core/atacseq/pull/92/commits/cf6c5c39d890a3a69afa42b9316db02670b85f8b
Hi @drpatelh, yes, it is, thank you! I hope it will not break something else. Thanks, Anna
I hope not! I remember specifically adding in quote="" to fix another bug. Hopefully should be fine ;)
Hi,
Some annotations contain special characters in the gene names; for example, in the dm3 genome some genes have a ' character in their name.
The function read.table in the script plot_homer_annotatepeaks.r can't read such files properly, and the whole pipeline crashes at the 'merge_library_macs_qc' step, whereas read.csv can process those files without problems. I guess this is because the default quote and comment character options differ between these two functions: https://stackoverflow.com/questions/12828438/read-csv-vs-read-table
Can you please fix it?
thanks, Anna
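
For anyone who wants to reproduce this without the full annotation, here is a minimal sketch of the behaviour described above. It is not the code from plot_homer_annotatepeaks.r: the three-column table, the column names and the helper `read_ok` are made up for illustration, and only the gene name beta'COP comes from this thread. It relies on the documented defaults: read.table uses quote = "\"'" and comment.char = "#", while read.csv uses quote = "\"" and comment.char = "".

```r
# Hypothetical tab-separated table; only the gene name beta'COP is real.
lines <- c("peak\tgene\tscore",
           "p1\tbeta'COP\t10",
           "p2\tCG1234\t20")
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Hypothetical helper: TRUE only if the file is read without warnings or
# errors and both gene names come back intact.
read_ok <- function(...) {
  out <- tryCatch(
    read.table(tmp, header = TRUE, sep = "\t", ...),
    error = function(e) NULL,
    warning = function(w) NULL
  )
  !is.null(out) &&
    nrow(out) == 2 &&
    identical(as.character(out$gene), c("beta'COP", "CG1234"))
}

# read.table defaults: the lone ' opens a quoted string that never closes.
default_ok <- read_ok()

# Same quote setting that read.csv uses by default: ' is an ordinary character.
csv_like_ok <- read_ok(quote = "\"")

# Quoting and comment handling switched off entirely.
no_quote_ok <- read_ok(quote = "", comment.char = "")

print(c(default = default_ok, csv_like = csv_like_ok, no_quote = no_quote_ok))
```

If the sketch matches what happens in the pipeline, restricting or disabling the quote characters in the read.table call should be enough, without switching functions. Whether that interacts with the other bug mentioned in the comments is something I can't tell from here.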