Error in detecting names for columns in Roary file (solved) + TypeError: binom_test() ...

ADGM commented 8 years ago

Hi, this is my first time using Scoary. I'm on version 1.3.3. I'm having trouble proceeding with the run, apparently because of the Roary file. I didn't change anything in gene_presence_absence.csv; it's just as Roary produced it.

Here is the error it throws:

    Warning: Could not properly detect the correct names for all columns in the ROARY table.
    Traceback (most recent call last):
      File "scoary.py", line 22, in <module>
        methods.main()
      File "/home/adgm/Scoary/scoary/methods.py", line 117, in main
        allowed_isolates=allowed_isolates)
      File "/home/adgm/Scoary/scoary/methods.py", line 218, in Csv_to_dic_Roary
        r[q[genecol]] = {"Non-unique Gene name": q[nugcol], "Annotation": q[anncol]} if roaryfile else {}
    IndexError: list index out of range

andrewjpage commented 8 years ago

Hi, What version of Roary did you use? After Oct 2015, the format of the spreadsheet changed.

ADGM commented 8 years ago

Based on the date I produced the file, it must have been Roary 3.5.9.

AdmiralenOla commented 8 years ago

Hi, and thanks for reporting! It seems Scoary can't identify the correct columns in your Roary output. Not sure why you're getting an error. I'm not getting any conflict issues with Roary 3.4.4 or 3.6.2, so I guess 3.5.9 should be okay as well. Would you be willing to send your gene_presence_absence.csv file to olbb@fhi.no, so I can test? (Just the first few lines would probably be enough if you are uncomfortable with sending your data.)

ADGM commented 8 years ago

Hi, so I did change something about the CSV file: the delimiter, which I changed to a semicolon (as was recommended to prevent issues with the annotation). I reverted to a comma delimiter for both the Roary file (I removed the commas in the annotation column to prevent possible problems) and the traits file, and the run went along.
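For anyone hitting the same delimiter problem, here is a minimal sketch of converting a semicolon-delimited table back to commas with Python's standard `csv` module. The function name `convert_delimiter` and the sample rows are hypothetical. Quoting every field keeps commas inside an annotation intact for any reader that honours quotes; whether Scoary 1.3.3 parses quoted commas this way is an assumption not confirmed in this thread.

```python
import csv
import io


def convert_delimiter(text, src=";", dst=","):
    """Rewrite delimited text with a new delimiter, quoting every field."""
    reader = csv.reader(io.StringIO(text), delimiter=src)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=dst, quoting=csv.QUOTE_ALL,
                        lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Hypothetical two-line sample with a comma inside the annotation field.
sample = '"Gene";"Annotation"\n"abcD";"kinase, putative"\n'
converted = convert_delimiter(sample)

# Reading the result back as comma-delimited keeps the annotation in one field.
rows = list(csv.reader(io.StringIO(converted)))
print(rows)
```

For a real file, the same function could be applied to the text read from gene_presence_absence.csv and the result written to a new file, leaving the original untouched.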

Unfortunately, I ran into another error:

    Calculating max number of contrasting pairs for each significant gene 0.00%
    Traceback (most recent call last):
      File "/home/adgm/Scoary/scoary.py", line 22, in <module>
        methods.main()
      File "/home/adgm/Scoary/scoary/methods.py", line 141, in main
        args.correction, upgmatree, GTC)
      File "/home/adgm/Scoary/scoary/methods.py", line 418, in StoreResults
        StoreTraitResult(Results[Trait], Trait, max_hits, p_cutoff, correctionmethod, upgmatree, GTC)
      File "/home/adgm/Scoary/scoary/methods.py", line 458, in StoreTraitResult
        alternative="greater")
    TypeError: binom_test() got an unexpected keyword argument 'alternative'

AdmiralenOla commented 8 years ago

Hi again,

I'll reword the delimiter part as it's a bit unclear right now. The other error you're getting is from using an older version of SciPy. I think the "alternative" argument (for one-sided binomial tests) only exists in 0.17+.
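A quick way to tell in advance whether an installation is affected is to compare the installed SciPy version against the 0.17 threshold mentioned above. The sketch below is only illustrative: the helper names `version_tuple` and `has_alternative_keyword` are hypothetical, and the (0, 17) cutoff is taken from this comment rather than verified independently. The version string would normally come from `scipy.__version__`.

```python
import re


def version_tuple(version):
    """Turn a string such as '0.17.1' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_alternative_keyword(scipy_version, minimum=(0, 17)):
    """Guess whether binom_test() accepts 'alternative' (assumed cutoff: 0.17)."""
    return version_tuple(scipy_version) >= minimum


# Example version strings; in practice pass scipy.__version__ instead.
for candidate in ("0.16.1", "0.17.1"):
    print(candidate, has_alternative_keyword(candidate))
```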

AdmiralenOla commented 8 years ago

If you don't want to or can't upgrade, you can manually edit the script. Scroll down to the following lines in methods.py:

        best_pairwise_comparison_p = ss.binom_test(max_propairs,
                                                   max_total_pairs,
                                                   0.5,
                                                   alternative="greater")
        worst_pairwise_comparison_p = ss.binom_test(max_total_pairs-max_antipairs,
                                                    max_total_pairs,
                                                    0.5,
                                                    alternative="greater")

Change this to:

        best_pairwise_comparison_p = ss.binom_test(max_propairs,
                                                   max_total_pairs,
                                                   0.5) / 2
        worst_pairwise_comparison_p = ss.binom_test(max_total_pairs-max_antipairs,
                                                    max_total_pairs,
                                                    0.5) / 2

This will work because the binomial distribution is symmetric when p=0.5. I'll change this in the next version of Scoary so that older versions of scipy will work as well.
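The symmetry argument can be checked with a few lines of pure Python. This is a sketch, not Scoary's code: `one_sided_greater` and `two_sided_symmetric` are hypothetical helpers, and the two-sided function is a simplified stand-in for a two-sided binomial test that is only valid for p = 0.5. It suggests that halving the two-sided p-value matches the one-sided "greater" p-value when the observed count is above half the number of trials, and that the two can differ when the count is at or below half.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def one_sided_greater(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def two_sided_symmetric(k, n):
    """Two-sided p-value for p = 0.5 only, doubling the smaller tail."""
    tail = min(k, n - k)
    smaller_tail = sum(binom_pmf(i, n) for i in range(tail + 1))
    return min(1.0, 2 * smaller_tail)


n = 10
for k in (8, 5, 2):
    halved = two_sided_symmetric(k, n) / 2
    exact = one_sided_greater(k, n)
    print(k, halved, exact)
```

With 8 successes out of 10 both approaches give 0.0546875; with 5 or 2 successes the halved value is smaller than the exact upper-tail probability, so the shortcut is best read as a workaround for counts above half.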

ADGM commented 8 years ago

Ah, I did get to upgrade python-scipy to 0.17.1. It works smoothly now, thank you! Useful data to have :+1:

AdmiralenOla / Scoary

Error in detecting names for columns in Roary file (solved) + TypeError: binom_test() ... #23