Closed: ADGM closed this issue 8 years ago.
Hi, What version of Roary did you use? After Oct 2015, the format of the spreadsheet changed.
On 14 June 2016 at 10:48, ADGM notifications@github.com wrote:
Hi, This is my first time to use Scoary. I'm on version 1.3.3. I seem to be having trouble proceeding with the run because of the Roary file? I didn't change anything with the gene_presence_absence.csv, it's just as Roary produced it.
Here is the error it throws:
Warning: Could not properly detect the correct names for all columns in the ROARY table.
Traceback (most recent call last):
  File "scoary.py", line 22, in
    methods.main()
  File "/home/adgm/Scoary/scoary/methods.py", line 117, in main
    allowed_isolates=allowed_isolates)
  File "/home/adgm/Scoary/scoary/methods.py", line 218, in Csv_to_dic_Roary
    r[q[genecol]] = {"Non-unique Gene name": q[nugcol], "Annotation": q[anncol]} if roaryfile else {}
IndexError: list index out of range
Going by the date I produced the file, it must have been Roary 3.5.9.
Hi, and thanks for reporting! It seems Scoary can't identify the correct columns in your Roary output. Not sure why you're getting an error. I'm not getting any conflict issues with Roary 3.4.4 or 3.6.2, so I guess 3.5.9 should be okay as well. Would you be willing to send your gene_presence_absence.csv file to olbb@fhi.no, so I can test? (Just the first few lines would probably be enough if you are uncomfortable with sending your data.)
Hi, So I did change something about the CSV file after all: the delimiter, which I had changed to a semicolon (as was recommended to prevent issues with the annotation). I reverted to a comma delimiter for both the Roary file (I removed the commas in the annotation column to prevent possible problems) and the traits file, and the run went ahead.
Unfortunately, I ran into another error:
Calculating max number of contrasting pairs for each significant gene
0.00%Traceback (most recent call last):
File "/home/adgm/Scoary/scoary.py", line 22, in
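For anyone hitting the same column-detection warning, the delimiter problem described above can be checked before running Scoary. The following is only a minimal sketch, not Scoary's own parsing code: the function name check_roary_header and the sample header lines are made up for illustration, and the expected column names are taken from the traceback plus Roary's "Gene" column.

```python
import csv
import io

# Column names assumed from the traceback ("Non-unique Gene name",
# "Annotation") plus Roary's "Gene" column. Not taken from Scoary's source.
EXPECTED = ("Gene", "Non-unique Gene name", "Annotation")


def check_roary_header(handle, delimiter=","):
    """Return (header, missing) for the first line of a Roary-style CSV.

    `missing` lists the expected column names that were not found when
    the header is split on `delimiter`.
    """
    header = next(csv.reader(handle, delimiter=delimiter))
    missing = [name for name in EXPECTED if name not in header]
    return header, missing


# Illustrative header lines only; a real file has many more columns.
comma_file = io.StringIO('"Gene","Non-unique Gene name","Annotation","No. isolates"\n')
semicolon_file = io.StringIO("Gene;Non-unique Gene name;Annotation;No. isolates\n")

for label, handle in (("comma", comma_file), ("semicolon", semicolon_file)):
    header, missing = check_roary_header(handle)
    if missing:
        print(f"{label}: {len(header)} field(s) found, missing columns: {missing}")
    else:
        print(f"{label}: header looks fine")
```

A semicolon-delimited file read with a comma delimiter collapses the whole header into a single field, so none of the expected columns are found. That is consistent with the warning and the IndexError reported at the top of this thread.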
Hi again,
I'll reword the delimiter part as it's a bit unclear right now. The other error you're getting is from using an older version of SciPy. I think the "alternative" argument (for one-sided binomial tests) only exists in 0.17+.
If you don't want / can't upgrade you can manually edit the script. Scroll down to the following lines in methods.py:
best_pairwise_comparison_p = ss.binom_test(max_propairs,
                                           max_total_pairs,
                                           0.5,
                                           alternative="greater")
worst_pairwise_comparison_p = ss.binom_test(max_total_pairs-max_antipairs,
                                            max_total_pairs,
                                            0.5,
                                            alternative="greater")
Change this to:
best_pairwise_comparison_p = ss.binom_test(max_propairs,
                                           max_total_pairs,
                                           0.5) / 2
worst_pairwise_comparison_p = ss.binom_test(max_total_pairs-max_antipairs,
                                            max_total_pairs,
                                            0.5) / 2
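This will work because the binomial distribution is symmetric when p=0.5, so half of the two-sided p-value equals the one-sided p-value (strictly, this holds when the observed count is above half of the total number of pairs). I'll change this in the next version of Scoary so that older versions of scipy will work as well.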
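To see when the halving shortcut matches the one-sided test, here is a small sketch that computes both quantities exactly with the standard library only, so it does not depend on the installed SciPy version. The function names are hypothetical and this is not Scoary's code. The two-sided value uses the common definition for an exact binomial test: the total probability of all outcomes that are no more likely than the observed one.

```python
from math import comb


def binom_tail_greater(k, n):
    """Exact P(X >= k) for X ~ Binomial(n, 0.5)."""
    k = max(k, 0)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def binom_two_sided(k, n):
    """Exact two-sided p-value at p = 0.5.

    Sums the probability of every outcome that is no more likely
    than observing k successes.
    """
    observed = comb(n, k)
    return sum(c for c in (comb(n, i) for i in range(n + 1)) if c <= observed) / 2 ** n


n = 10
for k in (8, 3):
    one_sided = binom_tail_greater(k, n)
    halved = binom_two_sided(k, n) / 2
    print(f"k={k}, n={n}: one-sided={one_sided}, two-sided/2={halved}")
```

With 8 successes out of 10 the two values agree. With 3 out of 10 they do not, because the one-sided "greater" p-value is then above 0.5 while half of a two-sided p-value can never exceed 0.5. The shortcut is therefore only a drop-in replacement when the observed count is above half of the total.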
Ah, I did get to upgrade python-scipy to 0.17.1. It works smoothly now, thank you! Useful data to have :+1: