Closed: cgroza closed this issue 1 year ago
Hi,
The cause of the problem could be that some of the names in the transposable element database look like malformed regular expressions (e.g. `[TE-755`): the `grep` command tries to interpret them as regular expressions instead of treating them as fixed strings. I could fix this by adding the `-F` option, but `grep` is used repeatedly in the pipeline, so I need some time to check which lines actually need `-F`, to avoid unpleasant surprises.
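To illustrate the difference described above, here is a minimal sketch, assuming GNU `grep`. The file of names is hypothetical; only `[TE-755` comes from this thread. It is not the pipeline's actual command, just a demonstration of regex mode versus `-F`.

```shell
# Hypothetical list of TE names; only '[TE-755' is taken from the thread.
names=$(mktemp)
printf '%s\n' '[TE-755' 'TE-755' 'roo' > "$names"

# Default mode: the pattern is parsed as a regular expression.
# The unterminated '[' is a syntax error, so grep reports an error
# (exit status greater than 1) instead of searching.
regex_status=0
grep -q '[TE-755' "$names" 2>/dev/null || regex_status=$?

# -F: the pattern is a fixed string, so '[' is an ordinary character.
fixed_hits=$(grep -F -c '[TE-755' "$names")

# -F still matches substrings: 'TE-755' is found inside '[TE-755' too.
substring_hits=$(grep -F -c 'TE-755' "$names")

# -x additionally requires the whole line to equal the pattern.
exact_hits=$(grep -F -x -c 'TE-755' "$names")

echo "regex exit status: $regex_status"
echo "fixed-string hits: $fixed_hits"
echo "substring hits:    $substring_hits"
echo "whole-line hits:   $exact_hits"
rm -f "$names"
```

The substring case is one example of the kind of surprise that makes it worth reviewing each `grep` call individually: whether `-x` (or some other anchoring) is wanted depends on what each line in the pipeline is searching.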
Just to be sure, since these are PacBio long reads: did you change the minimap2 `PRESET_OPTION` from `map-ont` to `map-pb` in the `config.yaml` file?
...
PARAMS:
    THREADS: 8 #number of threads for some task
    OUTSIDER_VARIANT:
        MINIMAP2:
            PRESET_OPTION: 'map-pb'
...
Thanks for this report, Mourdas
OK, I checked the names for regular expression special characters and changed them to `_`. Indeed, it is a challenge to curate the TE database names for such cases. And yes, I switched to `map-pb`. Will rerun and see how it goes!
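For anyone hitting the same problem, the renaming can be sketched roughly as follows. This assumes the TE database is a FASTA file whose names are on the `>` header lines; the file contents and the exact set of characters replaced are illustrative assumptions, not necessarily what was done here.

```shell
# Hypothetical FASTA content; only '[TE-755' is taken from the thread.
fasta=$(mktemp)
printf '%s\n' '>[TE-755' 'ACGT' '>gypsy(1)' 'TTGA' > "$fasta"

# On header lines only (those starting with '>'), replace each regex
# metacharacter with '_'. Inside the bracket expression, ']' is placed
# first so that it is taken literally.
clean=$(sed '/^>/ s/[][\\.^$*+?(){}|]/_/g' "$fasta")

printf '%s\n' "$clean"
rm -f "$fasta"
```

With this input, the headers become `>_TE-755` and `>gypsy_1_`, while the sequence lines are left unchanged. Note that renaming can make two distinct names collide, so it is worth checking the result for duplicates.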
Thanks, Cristian
Hi,
I am encountering another error, this time on a separate human dataset (not Drosophila, as in issue #5). It happens in rule MERGE_TE:
The human genome is HG002 with 30X PacBio long reads. Any advice on what could be causing this?
My thanks, Cristian Groza