jambler24 / GenGraph

A repository for the GenGraph toolkit for the creation and manipulation of graph genomes
GNU General Public License v3.0

set mauve scratch path #9

Open jambler24 opened 5 years ago

jambler24 commented 5 years ago

--scratch-path-1 is hard-coded. This needs to become relative.
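
A minimal sketch of one way to avoid the hard-coded path, under stated assumptions: the helper name `build_mauve_command`, its arguments, the scratch directory names and the `.xmfa` / `.backbone` suffixes are hypothetical and not GenGraph's actual code. Only the `--output`, `--backbone-output`, `--scratch-path-1` and `--scratch-path-2` flags are taken from the progressiveMauve usage text. The scratch root can be any caller-supplied path (for example one relative to the working directory), with the system temp directory as a fallback.

```python
import os
import tempfile


def build_mauve_command(seq_paths, out_prefix, scratch_root=None):
    """Return a progressiveMauve argument list with scratch paths under scratch_root.

    Hypothetical helper: the name, arguments and file suffixes are illustrative.
    """
    # Fall back to the system temp directory instead of a fixed absolute path.
    if scratch_root is None:
        scratch_root = tempfile.gettempdir()
    scratch_1 = os.path.join(scratch_root, "mauve_scratch_1")
    scratch_2 = os.path.join(scratch_root, "mauve_scratch_2")
    for path in (scratch_1, scratch_2):
        os.makedirs(path, exist_ok=True)

    # Options first, then one sequence file per genome.
    return [
        "progressiveMauve",
        "--output=" + out_prefix + ".xmfa",
        "--backbone-output=" + out_prefix + ".backbone",
        "--scratch-path-1=" + scratch_1,
        "--scratch-path-2=" + scratch_2,
    ] + list(seq_paths)
```

The returned list could be passed to `subprocess.run`, which avoids building a shell string by hand.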

Devinaseeruttun commented 4 years ago

Hi, one question: why does the program get stuck when I start running it?

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name Documents/output
Conducting progressiveMauve progressiveMauve

It got stuck.

I am using a Mac:
Processor: 2.7 GHz Intel Core i7
Memory: 16 GB
Input: two sequences, 4.5 MB each

Thank you for your precious help,
Devina

jambler24 commented 4 years ago

Hi Devina,

Thank you for raising this issue, I'll be happy to take a look at it.

Firstly, which version are you running? The latest that is in the repository?

Let's see if we can pin down the issue.

Devinaseeruttun commented 4 years ago

Hi,

Yes, I am using the latest version in the repository.
However, since I am using a Mac, I installed the alignment software using the following commands:

1. # Install a MSA tool

Muscle

curl -fksSL http://drive5.com/muscle/downloads3.8.31/muscle3.8.31_i86linux64.tar.gz | tar xz && \
mv muscle3.8.31_i86linux64 /usr/local/bin/muscle3.8.31_i86darwin64
curl -fksSL http://drive5.com/muscle/downloads3.8.31/muscle3.8.31_i86darwin64.tar.gz | tar xz && \
mv muscle3.8.31_i86darwin64 /usr/local/bin/muscle3.8.31_i86darwin64

  1. git clone https://github.com/jambler24/GenGraph
  2. Install MAUVE

    RUN curl -fksSL http://darlinglab.org/mauve/snapshots/2015/2015-02-13/linux-x64/mauve_linux_snapshot_2015-02-13.tar.gz | tar xz && \
    cp mauve_snapshot_2015-02-13/linux-x64/progressiveMauve /usr/local

I downloaded MAUVE and copied it to the Applications folder on my Mac, then ran the following command:

sudo cp /Applications/Mauve.app/Contents/MacOS/progressiveMauve /usr/local/bin/

Thank you
Cheers,
Devina

jambler24 commented 4 years ago

Thanks Devina,

OK, I'm going to do some testing, add some checks in the code and push an update, hopefully by tonight.

I'll check back here when they are out :)

jambler24 commented 4 years ago

Hi Devina,

I have pushed some new code with a bit more detail in the log file.

In testing, it seems to be running but probably not finding progressiveMauve on your side for some reason.

I think the sudo cp /Applications/Mauve.app/Contents/MacOS/progressiveMauve /usr/local/bin/ step is the issue. Can you try running:

/usr/local/bin/progressiveMauve

and check that it is working?

Also, pull the new code and have a look in the .log file; that should have some clues.
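
The same check can be scripted. This is only a sketch: `find_executable` and its `extra_dirs` default are hypothetical names chosen for illustration, not part of GenGraph. It relies on the standard-library `shutil.which`, which searches the directories listed in `PATH`.

```python
import os
import shutil


def find_executable(name, extra_dirs=("/usr/local/bin",)):
    """Return the full path of an executable, or None if it cannot be found."""
    # shutil.which searches the directories listed in PATH.
    found = shutil.which(name)
    if found:
        return found
    # Fall back to a few explicit directories (hypothetical defaults).
    for directory in extra_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    path = find_executable("progressiveMauve")
    if path is None:
        print("progressiveMauve not found; check the install location and PATH")
    else:
        print("progressiveMauve found at", path)
```

Running a check like this before starting the alignment would turn a silent hang into an explicit message.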

Devinaseeruttun commented 4 years ago

Thank you. I have run /usr/local/bin/progressiveMauve:

(base) devina@Devinas-MacBook-Pro ~ % /usr/local/bin/progressiveMauve
progressiveMauve usage:

When each genome resides in a separate file:
/usr/local/bin/progressiveMauve [options] ...

When all genomes are in a single file:
/usr/local/bin/progressiveMauve [options]

Options:
--island-gap-size= Alignment gaps above this size in nucleotides are considered to be islands [20]
--profile= (Not yet implemented) Read an existing sequence alignment in XMFA format and align it to other sequences or alignments
--apply-backbone= Read an existing sequence alignment in XMFA format and apply backbone statistics to it
--disable-backbone Disable backbone detection
--mums Find MUMs only, do not attempt to determine locally collinear blocks (LCBs)
--seed-weight= Use the specified seed weight for calculating initial anchors
--output= Output file name. Prints to screen by default
--backbone-output= Backbone output file name (optional).
--match-input= Use specified match file instead of searching for matches
--input-id-matrix= An identity matrix describing similarity among all pairs of input sequences/alignments
--max-gapped-aligner-length= Maximum number of base pairs to attempt aligning with the gapped aligner
--input-guide-tree= A phylogenetic guide tree in NEWICK format that describes the order in which sequences will be aligned
--output-guide-tree= Write out the guide tree used for alignment to a file
--version Display software version information
--debug Run in debug mode (perform internal consistency checks--very slow)
--scratch-path-1= Designate a path that can be used for temporary data storage. Two or more paths should be specified.
--scratch-path-2= Designate a path that can be used for temporary data storage. Two or more paths should be specified.
--collinear Assume that input sequences are collinear--they have no rearrangements
--scoring-scheme=<ancestral|sp_ancestral|sp> Selects the anchoring score function. Default is extant sum-of-pairs (sp).
--no-weight-scaling Don't scale LCB weights by conservation distance and breakpoint distance
--max-breakpoint-distance-scale=<number [0,1]> Set the maximum weight scaling by breakpoint distance. Defaults to 0.5
--conservation-distance-scale=<number [0,1]> Scale conservation distances by this amount. Defaults to 0.5
--muscle-args= Additional command-line options for MUSCLE. Any quotes should be escaped with a backslas…

It is working.

I refer to log file.

Thank you again Cheers Devina

Devinaseeruttun commented 4 years ago

Hi, what is the new code? I was testing GenGraph with your reference WGSs.

The log file:

INFO:root:({'seq0': 'H37Ra', 'seq1': 'F11'}, {'H37Ra': 'Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'F11': 'Documents/genomes/F11/sequence.fasta/F11.fas'}, ['Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'Documents/genomes/F11/sequence.fasta/F11.fas'], {'H37Ra': 'NA', 'F11': 'NA'})
INFO:root:{'seq_name': 'H37Ra', 'aln_name': 'seq0', 'seq_path': 'Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'annotation_path': 'NA'}
INFO:root:{'seq_name': 'F11', 'aln_name': 'seq1', 'seq_path': 'Documents/genomes/F11/sequence.fasta/F11.fas', 'annotation_path': 'NA'}
INFO:root:({'seq0': 'H37Ra', 'seq1': 'F11'}, {'H37Ra': 'Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'F11': 'Documents/genomes/F11/sequence.fasta/F11.fas'}, ['Documents/genomes/H37Ra/sequence.fasta/H37Ra.fas', 'Documents/genomes/F11/sequence.fasta/F11.fas'], {'H37Ra': 'NA', 'F11': 'NA'})

Console system log:

Jun 11 19:24:02 Devinas-MacBook-Pro login[4869]: DEAD_PROCESS: 4869 ttys000
Jun 11 19:24:09 Devinas-MacBook-Pro login[60192]: USER_PROCESS: 60192 ttys000
Jun 11 19:24:40 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.04000000-0100-0000-0000-000000000000[60191]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:25:06 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.AddressBook.abd): Service only ran for 1 seconds. Pushing respawn out by 9 seconds.
Jun 11 19:25:31 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.04000000-0200-0000-0000-000000000000[60209]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:25:53 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.AddressBook.abd): Service only ran for 1 seconds. Pushing respawn out by 9 seconds.
Jun 11 19:25:59 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.10000000-0500-0000-0000-000000000000[60210]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:25:59 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.03000000-0100-0000-0000-000000000000[60214]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:20 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.09000000-0100-0000-0000-000000000000[60208]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:40 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.04000000-0300-0000-0000-000000000000[60218]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:47 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0D000000-0400-0000-0000-000000000000[60215]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:47 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.03000000-0200-0000-0000-000000000000[60222]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:26:47 Devinas-MacBook-Pro com.apple.xpc.launchd[1] (com.apple.mdworker.shared.10000000-0600-0000-0000-000000000000[60223]): Service exited due to SIGKILL | sent by mds[133]
Jun 11 19:27:35 Devinas-MacBook-Pro syncdefaultsd[60232]: objc[60232]: Class SYDClient is implemented in both /System/Library/PrivateFrameworks/SyncedDefaults.framework/Versions/A/SyncedDefaults and /System/Library/PrivateFrameworks/SyncedDefaults.framework/Support/syncdefaultsd. One of the two will be use

It is still not running. Please help. Thank you, cheers

Devina

jambler24 commented 4 years ago

Aaah, OK, I think the problem is the file extension.

Try changing the file extension to .fa instead of .fas.

Also maybe change the "sequence.fasta" directory to "sequence_fasta", as this can cause some confusion.
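
For several genomes the renaming can be scripted. A small sketch using only the standard library; `rename_fas_to_fa` is a hypothetical helper name, and the paths listed in the sequence file passed to --seq_file would still need to be updated by hand to match.

```python
from pathlib import Path


def rename_fas_to_fa(directory):
    """Rename every *.fas file directly inside directory to *.fa.

    Returns the list of new paths. Hypothetical helper for illustration.
    """
    renamed = []
    for old_path in sorted(Path(directory).glob("*.fas")):
        new_path = old_path.with_suffix(".fa")
        # Skip rather than overwrite a file that already exists.
        if new_path.exists():
            continue
        old_path.rename(new_path)
        renamed.append(new_path)
    return renamed
```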

Devinaseeruttun commented 4 years ago

Thank you. I made the requested changes. Now I have the following error:

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name genoutput
Conducting progressiveMauve progressiveMauve
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 109, in <module>
    genome_aln_graph = bbone_to_initGraph(bbone_file, parsed_input_dict)
  File "/Users/devina/GenGraph/gengraph.py", line 1577, in bbone_to_initGraph
    iso_length = len(input_parser(input_dict[1][iso])[0]['DNA_seq'])
TypeError: 'NoneType' object is not subscriptable
(base) devina@Devinas-MacBook-Pro ~ %

Thank you for your precious help. Cheers

jambler24 commented 4 years ago

Can you pull the latest version of the code by running git pull in the GenGraph directory, then run it again and have a look at the genoutput.log file?

Devinaseeruttun commented 4 years ago

Done. When I run it I get the following error:

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name output
Conducting progressiveMauve progressiveMauve Complete
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 117, in <module>
    refine_initGraph(genome_aln_graph)
  File "/Users/devina/GenGraph/gengraph.py", line 1520, in refine_initGraph
    presorted_list.append((a_node, abs(a_graph.node[a_node][isolate + '_leftend']), abs(a_graph.node[a_node][isolate + '_rightend'])))
AttributeError: 'MultiDiGraph' object has no attribute 'node'
(base) devina@Devinas-MacBook-Pro ~ %

Please help.

Thank you
Cheers

Devinaseeruttun commented 4 years ago

Hi, I have networkx 2.4 installed. Do you think that is why I am having the following problem?

AttributeError: 'MultiDiGraph' object has no attribute 'node'

Please help.

Thank you
Cheers
Devina

Devinaseeruttun commented 4 years ago

Yes, networkx 2.4 was the problem. Now I am stuck on a new error.

The key error is '_leftend':

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name output
Conducting progressiveMauve progressiveMauve Complete
Conducting local node realignment
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 138, in <module>
    add_graph_data(genome_aln_graph)
  File "/Users/devina/GenGraph/gengraph.py", line 2606, in add_graph_data
    if abs(int(data[an_isolate + '_leftend'])) == 1:
KeyError: '_leftend'

Please help. Thank you
Cheers

jambler24 commented 4 years ago

Hi Devina,

Thanks for spotting the networkx 2.4 problem. I will add a check for that too.

I am taking a look at the code to see what could be causing the new error and will get back to you ASAP.
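
For context on the AttributeError reported earlier in this thread: networkx 2.4 removed the deprecated `Graph.node` attribute, while `Graph.nodes[n]` is available throughout networkx 2.x. Below is a hedged sketch of a version-tolerant accessor; `node_data` is a hypothetical helper, not the fix that was pushed to GenGraph.

```python
def node_data(graph, node):
    """Return the attribute dict of a node across networkx versions.

    Hypothetical helper, not part of GenGraph.
    """
    try:
        # networkx 2.x: graph.nodes is a view that supports subscripting.
        return graph.nodes[node]
    except TypeError:
        # networkx 1.x: graph.nodes is a method, so subscripting raises
        # TypeError and the attributes live in graph.node instead.
        return graph.node[node]
```

With such a helper, an expression like `a_graph.node[a_node][isolate + '_leftend']` from the traceback would become `node_data(a_graph, a_node)[isolate + '_leftend']`.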

pramesh-cfh11 commented 4 years ago

Hi, I installed your package on Ubuntu 20.04 LTS and I had to make a couple of edits to input_parser to get it to read my input. However, I'm stuck with an error message:

  File "/home/pradeep/py_vienna/lib/python3.8/site-packages/networkx/readwrite/graphml.py", line 466, in add_data
    raise nx.NetworkXError(msg % elementtype)
networkx.exception.NetworkXError: GraphML writer does not support <class 'numpy.str'> as data values.

I'm not sure how to fix this, since I downgraded to networkx==2.3.
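
One possible workaround, offered as an untested assumption rather than a confirmed fix: the error message suggests that a node or edge attribute holds a numpy string scalar, so converting numpy scalars to built-in Python values before the graph is written may help. The helper names `to_native` and `coerce_graph_attributes` are hypothetical. The sketch relies only on `nodes(data=True)` and `edges(data=True)`, which networkx 2.x graphs provide, and on the `.item()` method of numpy scalars.

```python
def to_native(value):
    """Return a built-in Python value for a numpy scalar; pass others through."""
    # numpy scalars (numpy.str_, numpy.int64, ...) report shape == () and
    # provide .item(), which returns the plain Python equivalent.
    if getattr(value, "shape", None) == () and callable(getattr(value, "item", None)):
        return value.item()
    return value


def coerce_graph_attributes(graph):
    """Convert numpy scalar attribute values on nodes and edges in place."""
    for _node, data in graph.nodes(data=True):
        for key, value in list(data.items()):
            data[key] = to_native(value)
    # Edges arrive as (u, v, data) tuples; only the data dict is needed.
    for *_ends, data in graph.edges(data=True):
        for key, value in list(data.items()):
            data[key] = to_native(value)
```

Calling `coerce_graph_attributes(graph)` just before the graph is exported with `networkx.write_graphml` would be the natural place to try it.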

Devinaseeruttun commented 4 years ago

Thank you. Devina

jambler24 commented 4 years ago

Hi all,

I have been doing some testing with docker images, and it looks like networkx v2.4 is the problem. Downgrading to 2.3 works, and I am updating the docker image.

I'm not sure about the error you are seeing, @pramesh-cfh11. What edits were made to the input_parser?

pramesh-cfh11 commented 4 years ago

Hi @jambler24, first off, thanks for being so prompt about this. Your paper and the related codebase are exactly what I've been looking for in my research. I'm listing the steps in order so you get a sense of the changes I made after downloading the latest git repository and running it.

I am using a python3.8 virtual environment on Ubuntu 20.04.

1) First error: 'numpy' not found.
Fix: I went through the gengraph.py file and saw that you had import numpy as np, but in the downstream block you were still calling numpy.array. I removed the alias and set it to import numpy.

2) Second error: indexing and parsing issue with the input file (numpy only accepts integer slices).
Fix: I saw that your code accepts a csv file, but when I prepared a file with the four columns, the outputs of parse_seq_file didn't look correct, so I modified this function to use hard-coded column numbers (see below) corresponding to the column names. I had to remove the header from the csv file, and then the code started to run before returning the error that I originally contacted you about.

3) The code didn't run with networkx==2.4; I got the same error as @Devinaseeruttun.

def parse_seq_file(path_to_seq_file):
    # input_parser is defined elsewhere in gengraph.py, and logging is
    # expected to be imported at module level.
    seq_file_dict = input_parser(path_to_seq_file)

    A_seq_label_dict = {}
    A_input_path_dict = {}
    ordered_paths_list = []
    anno_path_dict = {}

    # Columns are addressed by hard-coded position instead of by name,
    # so the header row must be removed from the input file.
    for a_seq_file in seq_file_dict:
        logging.info(a_seq_file)
        A_seq_label_dict[a_seq_file[1]] = a_seq_file[0]
        A_input_path_dict[a_seq_file[0]] = a_seq_file[2]
        ordered_paths_list.append(a_seq_file[2])
        anno_path_dict[a_seq_file[0]] = a_seq_file[3]

    return A_seq_label_dict, A_input_path_dict, ordered_paths_list, anno_path_dict

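
An alternative to hard-coded column numbers is to keep the header row and look columns up by name. The sketch below is not GenGraph's parser: `parse_seq_rows` is a hypothetical function, the comma delimiter is an assumption, and the column names `seq_name`, `aln_name`, `seq_path` and `annotation_path` are taken from the log output posted earlier in this thread.

```python
import csv
import io


def parse_seq_rows(handle, delimiter=","):
    """Build the four lookup structures from a seq file that has a header row."""
    label_by_aln = {}
    path_by_seq = {}
    ordered_paths = []
    anno_by_seq = {}
    # DictReader uses the first row as field names, so no index arithmetic.
    for row in csv.DictReader(handle, delimiter=delimiter):
        label_by_aln[row["aln_name"]] = row["seq_name"]
        path_by_seq[row["seq_name"]] = row["seq_path"]
        ordered_paths.append(row["seq_path"])
        anno_by_seq[row["seq_name"]] = row["annotation_path"]
    return label_by_aln, path_by_seq, ordered_paths, anno_by_seq


if __name__ == "__main__":
    # Made-up example input, for illustration only.
    example = io.StringIO(
        "seq_name,aln_name,seq_path,annotation_path\n"
        "H37Ra,seq0,genomes/H37Ra.fa,NA\n"
        "F11,seq1,genomes/F11.fa,NA\n"
    )
    print(parse_seq_rows(example))
```

Because the lookup is by name, reordering the columns in the input file would not change the result.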
jambler24 commented 4 years ago

Just an update, the code now supports networkx 2.4.

The code works in testing, so I will have to test with Python 3.8 in a container and see if there is something going wrong there.

Devinaseeruttun commented 4 years ago

Great. Thank you. Cheers, Devina

pramesh-cfh11 commented 4 years ago

Awesome, thanks @jambler24 - awaiting your testing.