visanuwan / cresil

CReSIL: Accurate Identification of Extrachromosomal Circular DNA from Long-read Sequences
MIT License

CReSIL trim issue #8

Open SilviaDeng opened 1 year ago

SilviaDeng commented 1 year ago

Hi, when I ran cresil trim -t 4, I got the following error.

[image: screenshot of the error]

I installed the latest version from GitHub.

Could you kindly help?

Thanks, Silvia

SilviaDeng commented 1 year ago

I got the same error when I changed the thread count from 4 to 1.

visanuwan commented 1 year ago

@SilviaDeng It seems like there is an underscore character (_) in the headers of your FASTA reference file. The quick fix is to replace the underscores in those headers with other characters (or remove them) and then reindex the file, e.g. >chr_yyy_1 to >chrYYY1. The proper fix will be in the next release of CReSIL.

BBBlanca commented 1 year ago

Hello @SilviaDeng,

Did you manage to solve the issue? I am having the same issue with trimming the data. If you have solved the issue, could you show how you did that? Thank you very much!!

Yue

BBBlanca commented 1 year ago

visanuwan wrote: "@SilviaDeng It seems like there is an underscore character (_) in the header of your FASTA reference file. The quick fix is to replace those headers with other characters in your FASTA reference file and reindex it. i.e. >chr_yyy_1 to >chrYYY1. The proper fix will be in the next release of CReSIL."

Hello, could you provide code showing how to do that? Many thanks!! Yue

BBBlanca commented 1 year ago

Hi Cresil group,

Thank you for developing this useful tool. I removed all the underscores (_) from the headers of my FASTA file, yet the same error still showed up. Could you help with this?

Thank you very much! I really like how the final circular sequence visualization looks, so it would be great if I could run the code successfully.

Thank you!

Yue
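If the error persists after editing the headers, one thing that may be worth checking is whether any header line still contains an underscore. The sketch below uses a toy file with a hypothetical name (demo_check.fa) in place of the real reference. It is only a diagnostic under that assumption, not a confirmed explanation of the error.

```shell
# Toy FASTA standing in for the edited reference (hypothetical file name).
cat > demo_check.fa <<'EOF'
>chr1
ACGT
>chrUn_KI270302v1
TTAA
EOF

# Count header lines (starting with ">") that still contain an underscore.
# "|| true" keeps the script going when grep finds nothing (exit status 1).
remaining=$(grep -c '^>.*_' demo_check.fa || true)
echo "headers with underscores: $remaining"

# List the offending headers with their line numbers so they can be fixed.
grep -n '^>.*_' demo_check.fa || true
```

A count of 0 on the real reference would suggest the headers are clean. In that case a stale index built from the old file is another possibility to rule out (again an assumption, since the thread does not confirm the cause).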

Haider-Hassan commented 5 months ago

Hello,

You can use this command to remove "_" from your reference fasta file headers:

sed 's/_//g' hg38.fa > hg38_underscoresremoved.fa

Here "hg38.fa" is your original reference file, and "hg38_underscoresremoved.fa" is the output FASTA file with "_" removed from the headers.
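A slightly more conservative variant of the command above restricts the substitution to header lines, so sequence lines are never touched. This is a minimal sketch on a toy file. The file names (demo_ref.fa, demo_ref.nounderscore.fa) are hypothetical, and the indexing commands in the trailing comments assume samtools and minimap2 are the tools used to build the reference index.

```shell
# Toy reference standing in for the real FASTA (hypothetical file names).
cat > demo_ref.fa <<'EOF'
>chr_yyy_1 test_contig
ACGTACGT
>chr2
TTGGCCAA
EOF

# Remove underscores only on header lines (those starting with ">");
# sequence lines are copied through unchanged.
sed '/^>/ s/_//g' demo_ref.fa > demo_ref.nounderscore.fa

# Show the rewritten headers.
grep '^>' demo_ref.nounderscore.fa

# Any index built from the old file is now stale and has to be rebuilt,
# for example (assumption: samtools and minimap2 are the indexers in use):
#   samtools faidx demo_ref.nounderscore.fa
#   minimap2 -d demo_ref.nounderscore.mmi demo_ref.nounderscore.fa
```

Note that deleting underscores could in principle make two different headers identical, so it may be worth confirming that the header names are still unique afterwards.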