HIT-ImmunologyLab / DBSCAN-SWA

Error in determining bac_sequence #8

Open peng-ye opened 2 years ago

peng-ye commented 2 years ago

Dear authors,

I found that some IDs have no sequence in the resulting fna file. From what I saw, all of those sequences should start at "0", i.e., the corresponding IDs look like xxxxx|0:\d+|DBSCAN-SWA (see below). I think something is wrong in how the boundary for bac_sequence is determined.

Another observation supporting this is that many sequences start with "[T|G|C]ATG" rather than "ATG". It seems the window should slide to the right by one base.

Would you please help check it out? Thanks.

[Screenshot: Screen Shot 2022-05-13 at 00 16 44]

gancao commented 2 years ago

I am sorry for the mistake. I parsed the protein locations using the Python package "Bio" (Biopython), which adjusts the start location by one base automatically. I have now updated dbscan-swa.py at https://github.com/gancao/DBSCAN-SWA-1.
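
For illustration, here is a minimal sketch of how an off-by-one like this can produce both symptoms reported above. It is not the actual DBSCAN-SWA code: the toy sequence, the variable names, and the assumption that 1 was subtracted from a start that Biopython had already converted to 0-based are all hypothetical.

```python
# Hypothetical illustration; not the actual DBSCAN-SWA code.
# Toy contig: the gene "ATGAAATAA" occupies 1-based positions 2..10.
genome = "GATGAAATAACC"

# GenBank writes this feature as "2..10" (1-based, inclusive).
# Biopython exposes it as start=1, end=10 (0-based, end-exclusive),
# so the values can be used directly as Python slice bounds.
start, end = 1, 10

correct = genome[start:end]      # coordinates used as parsed
shifted = genome[start - 1:end]  # assumed bug: 1 subtracted a second time

# A feature at the very start of a contig (GenBank "1..9") has start=0.
first_start, first_end = 0, 9
empty = genome[first_start - 1:first_end]  # slice [-1:9] selects nothing

print(repr(correct))  # starts with "ATG"
print(repr(shifted))  # one extra leading base before "ATG"
print(repr(empty))    # empty sequence for the ID that starts at 0
```

Under these assumptions, using the parsed start as-is restores the leading "ATG" and avoids the empty records for IDs that begin at 0.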

Thanks for your interest in DBSCAN-SWA. If you have any other questions, please comment on GitHub.