Open eKariuki-sleepy opened 3 years ago
Hello,
In the screenshot head section you shared of the subsys_db.fa database, it looks like there may be an erroneous newline that is repeated in the entries. If you look at lines 5 and 6, there seems to be a newline (return) between "Tricarballylate_Utilization" and "Carbohydrates".
Can you check to see if those are on the same line, separated by tabs? It should look like:
>fig|1085.1.peg.628 TcuB: works with TcuA to oxidize tricarballylate to cis-aconitate Tricarballylate_Utilization Carbohydrates Organic acids cmr|NT02RR3166,gb|AAN75034.1,gb|ABC23782.1,gi|25989730,gi|83577231,gi|83594317,gnl|md5|4a7b7dbab4b1cb4f1eb179ee,img|637827101,kegg|rru:Rru_A2987,ref|YP_428069.1,tr|Q2RQ13,tr|Q8GDD2
MFDPCDLPPPPAPAPGASAAEAEARRVLALCTVCGYCTGLCDVFRAAERRPALTSGDLGHLAHLCHGCQACWHACQYTPPHVFAIVVPATLARVRAESYARHAWPRPLKGPAVLALALAATLVVPLLTVLLVPSQDLFAANAAPGAFYGVIPWGVMTPIALLTLGWAALAVGLGVARFWREGAQGPPAAPLARVWGRALADIVSLRNLKGGGRGCFETDDRPSHRRRWLHHALAGGFLLCLGSTLAATVYHHGLGREAPYPLTSLPVLLGLVGGCLMVGGASGLAWLKRHADPEPQAAETLGADRCLLAMLIAVALSGLVLLALRDTAAMGLLLALHLGTVLGFFITLPYGKFVHGAYRAAALLRSAAERRTDPRAPLAERPGVDRDLP
This might be what's occurring. Did you make any changes to the subsys_db, or when did you download it?
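A quick way to test for the stray-newline problem described above is to scan the FASTA file for lines that look like the broken-off tail of a header. The sketch below is only an illustration, not part of SAMSA2: the function name `find_malformed_headers` and the `min_fields` threshold are assumptions, and it relies on the observation that sequence lines should never contain a tab.

```python
def find_malformed_headers(lines, min_fields=2):
    """Return (line_number, text) pairs for lines that look suspicious.

    Assumptions (not taken from the SAMSA2 code):
      * header lines start with ">" and hold tab-separated fields;
      * sequence lines never contain a tab, so a non-header line with a
        tab is probably the tail of a header split by a stray newline.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A header with too few tab-separated fields may have been cut short.
            if len(line.split("\t")) < min_fields:
                problems.append((number, line))
        elif "\t" in line:
            # Tabs outside a header suggest a header continuation.
            problems.append((number, line))
    return problems
```

To run it on the database, pass an open file handle, for example `find_malformed_headers(open("subsys_db.fa"))`, and print whatever it returns; an empty list means no line matched either heuristic.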
I did not make any changes to the database. I have used a better text viewer (here) and it appears to be in the same format as the one you shared above.
Thank you.
Got it, thanks. I just re-downloaded using the command:
wget "https://zenodo.org/record/5022377/files/subsys_db.fa.bz2" --no-check-certificate
bunzip2 subsys_db.fa.bz2
and I see the proper lines (no extra line breaks). Can you try re-downloading this database with the link here and see if you still get the same error?
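If it helps to inspect the fresh download before decompressing it, the first few lines can be read straight from the .bz2 archive. This is a minimal sketch using Python's standard `bz2` module; the helper name `head_of_bz2` is made up for illustration.

```python
import bz2
from itertools import islice


def head_of_bz2(path, n=6):
    """Return the first n lines of a bzip2-compressed text file.

    Reads lazily, so the whole archive is never decompressed.
    """
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]
```

Calling `head_of_bz2("subsys_db.fa.bz2")` should show header lines starting with ">" alternating with sequence lines, with no header text spilling onto a line of its own.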
Thank you. Let me do so and give you feedback.
Hello Sam,
I downloaded the file and ran it, and I am experiencing the same error. I thought it might be something with Python 2, since I am working on an HPC with py2 installed in a conda environment. I tried py3 as well, and the error persists. Also, immediately after I run the DIAMOND_subsystems_analysis_counter.py script, it deletes itself and I have to copy or re-download it, something I have never experienced before. Could this be related to why it is not working?
Thanks.
I am also getting the error described at the top of this thread, again at line 133, though I do not have the issue with the script getting deleted.
@cbeekman Are you also using the default downloaded Subsystems database with no modifications?
One quick test: if you modify line 133 of DIAMOND_subsystems_analysis_counter.py to be:
if "NO HIERARCHY" in splitline:
Do you still get the same error?
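To see why a membership test can behave differently from indexing, here is a small stand-alone sketch. It is not the actual code from DIAMOND_subsystems_analysis_counter.py; the function `classify_header` and its return labels are hypothetical, and the only thing taken from this thread is that each line is split on tabs and checked for "NO HIERARCHY".

```python
def classify_header(line):
    """Classify one database header line (illustrative only).

    Returns "no_hierarchy", "too_few_fields", or "ok".
    """
    splitline = line.rstrip("\n").split("\t")
    # Membership on a list is safe for any number of fields and matches
    # only a field that is exactly "NO HIERARCHY".
    if "NO HIERARCHY" in splitline:
        return "no_hierarchy"
    # Indexing splitline[1] directly would raise IndexError here.
    if len(splitline) < 2:
        return "too_few_fields"
    return "ok"
```

A line that reports "too_few_fields" is one where indexing a later field would fail, which is what happens when the input has no tabs where the script expects them.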
Hi,
Thanks for the quick reply. You can disregard my post; I have since realized I made a mistake. The issue was that I was directing the script to the DIAMOND-indexed version of the database instead of the FASTA file version.
Thanks, Chapman
Okay, great, glad to hear!
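The mix-up described above, pointing the script at the DIAMOND-indexed database instead of the FASTA file, can be caught with a simple check before running the counter. The sketch below makes no assumption about the layout of DIAMOND's index format; it only tests whether the file starts like plain-text FASTA. The name `looks_like_fasta` and the `probe_size` parameter are hypothetical.

```python
def looks_like_fasta(path, probe_size=1024):
    """Heuristically check whether a file is plain-text FASTA.

    Reads only the first probe_size bytes. A FASTA file should begin
    with ">" (after optional whitespace) and contain no NUL bytes,
    which are typical of binary files.
    """
    with open(path, "rb") as handle:
        chunk = handle.read(probe_size)
    return chunk.lstrip().startswith(b">") and b"\x00" not in chunk
```

If `looks_like_fasta("subsys_db.fa")` returns False, the path most likely points at a binary or otherwise non-FASTA file.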
Hi, I am trying to run the Python script DIAMOND_subsystems_analysis_counter.py and I keep getting this error:
I cannot seem to figure it out, unfortunately. All I understand is that we are indexing the second field and splitting the field using sep "\t", with the exception of line 1, which has NO HIERARCHY. Also, find the head section of my subsys_db.fa database here. Might there be a problem with my database? Kindly assist.