Gengraph running problem

jambler24 / GenGraph

A repository for the GenGraph toolkit for the creation and manipulation of graph genomes

GNU General Public License v3.0


Gengraph running problem #16

Open Devinaseeruttun opened 4 years ago

Devinaseeruttun commented 4 years ago

Hi,

One question: why does the program get stuck when I start running it?

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name Documents/output
Conducting progressiveMauve
progressiveMauve

I am using a Mac:
Processor: 2.7 GHz Intel Core i7
Memory: 16 GB
Two sequences, 4.5 MB each

Thank you for your precious help.
Devina

jambler24 commented 4 years ago

Hi @Devinaseeruttun ,

How large are the genomes you are using?

I have updated the code to include more error reporting. Please pull the latest version as the output may help with troubleshooting.

Devinaseeruttun commented 4 years ago

Hi,

The size of one genome is around 24 MB, and I am running 4 genomes. OK, I will pull the latest version, run it again, and check the output for troubleshooting. Thank you.

-- http://www.uom.ac.mu/index.php/email-disclaimer

Devinaseeruttun commented 4 years ago

Hi,

Same problem:

(base) devina@Devinas-MacBook-Pro ~ % python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/anagengraph.txt --out_file_name output
Running GenGraph Toolkit
Creating genome graph
Conducting progressiveMauve
progressiveMauve Complete
Conducting local node realignment
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 152, in <module>
    add_graph_data(genome_aln_graph)
  File "/Users/devina/GenGraph/gengraph.py", line 2606, in add_graph_data
    if abs(int(data[an_isolate + '_leftend'])) == 1:
KeyError: '_leftend'

I am attaching the output.log file

Thank you

jambler24 commented 4 years ago

Hi @Devinaseeruttun ,

I can't see the output.log file attachment, could you please send the last few lines?

Also if possible the anagengraph.txt contents

I have updated the code to support networkx 2.4, and it seems to be working in testing. I still need to update the pip version, but just want to do some more testing first.

Devinaseeruttun commented 4 years ago

Hi,

Thank you. The last few lines of the output file:

INFO:root:{'H37Rv_leftend': 3730816, 'H37Rv_rightend': 3730922, 'ids': 'H37Rv', 'name': 'Aln_368'}
INFO:root:Aln_369
INFO:root:{'H37Rv_leftend': 3731052, 'H37Rv_rightend': 3732706, 'ids': 'H37Rv', 'name': 'Aln_369'}
INFO:root:Aln_370
INFO:root:{'H37Rv_leftend': 3732760, 'H37Rv_rightend': 3732829, 'ids': 'H37Rv', 'name': 'Aln_370'}
INFO:root:Aln_371
INFO:root:{'H37Rv_leftend': 3937855, 'H37Rv_rightend': 3938140, 'ids': 'H37Rv', 'name': 'Aln_371'}
INFO:root:Aln_372
INFO:root:{'H37Rv_leftend': 3942504, 'H37Rv_rightend': 3942527, 'ids': 'H37Rv', 'name': 'Aln_372'}
INFO:root:Aln_373
INFO:root:{'H37Rv_leftend': 3942705, 'H37Rv_rightend': 3942826, 'ids': 'H37Rv', 'name': 'Aln_373'}
INFO:root:Aln_374
INFO:root:{'H37Rv_leftend': 3948150, 'H37Rv_rightend': 3948171, 'ids': 'H37Rv', 'name': 'Aln_374'}
INFO:root:Aln_375
INFO:root:{'ids': '', 'name': 'Aln_375'}

Attached both the output log and anagengraph.txt files.

Cheers

Devina

seq_name         aln_name  seq_path                                             annotation_path
F11              seq1      Documents/gene/F11/F11.fa                            NA
H37Ra            seq2      Documents/gene/H37Ra/H37Ra.fa                        NA
H37Rv            seq3      Documents/gene/H37Rv/H37Rv.fa                        NA
Beijing_NITR203  seq4      Documents/gene/Beijing_NITR203/Beijing_NITR203.fa    NA

jambler24 commented 4 years ago

Ah, can I ask as a test for you to remove the underscore ( _ ) in Beijing_NITR203?

And start the seq count from 0, so seq0, seq1, seq2, seq3?
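The two requests above (no underscore in seq_name, aln_name counting up from seq0) can be checked before starting a run. The following is only a sketch: `check_seq_rows` is a hypothetical helper, not part of GenGraph, and it assumes the sequence file has already been parsed into one dict per row using the column names from anagengraph.txt.

```python
def check_seq_rows(rows):
    """Return a list of warnings for a parsed sequence file.

    Hypothetical helper, not part of GenGraph. `rows` is a list of dicts
    with the keys 'seq_name' and 'aln_name', one dict per genome.
    """
    warnings = []
    for index, row in enumerate(rows):
        # Underscores in seq_name were the suspected trigger in this thread.
        if "_" in row["seq_name"]:
            warnings.append(f"seq_name '{row['seq_name']}' contains an underscore")
        # aln_name is expected to count up from seq0 in file order.
        expected = f"seq{index}"
        if row["aln_name"] != expected:
            warnings.append(f"aln_name '{row['aln_name']}' should be '{expected}'")
    return warnings


# The rows as posted in this thread.
rows = [
    {"seq_name": "F11", "aln_name": "seq1"},
    {"seq_name": "H37Ra", "aln_name": "seq2"},
    {"seq_name": "H37Rv", "aln_name": "seq3"},
    {"seq_name": "Beijing_NITR203", "aln_name": "seq4"},
]
for message in check_seq_rows(rows):
    print(message)
```

With the rows as posted, the check reports the underscore in Beijing_NITR203 and the four aln_name values that start from seq1 instead of seq0.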

Devinaseeruttun commented 4 years ago

It’s running

Thank you

Cheers

jambler24 commented 4 years ago

No problem,

I'll look into fixing this, it is an old requirement from early versions that I will have to patch.

Cheers!

Devinaseeruttun commented 4 years ago

Hi,

I need your help. I am working with a very large dataset of 20 WGS, above 200 MB in size. I am running GenGraph on a cluster.

First, how do I estimate the memory requirement?

Second, how do I call progressiveMauve? Mauve is available and the path is /opt/exp_soft/bioinf/Mauve. I have added this to my .bash_profile:

export PATH=$PATH:/opt/exp_soft/bioinf/Mauve/progressiveMauve/bin/

and I run:

module load bioinformatics

When I run the GenGraph tool I get the following errors:

[bhookund@n22 ~]$ python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Gengraph/Stec20/stec20.txt --out_file_name output
Running GenGraph Toolkit
Creating genome graph
[OrderedDict([('seq_name', 'stec13'), ('aln_name', 'seq0'), ('seq_path', 'Gengraph/Stec20/stec13/stec13.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec18'), ('aln_name', 'seq1'), ('seq_path', 'Gengraph/Stec20/stec18/stec18.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec24'), ('aln_name', 'seq2'), ('seq_path', 'Gengraph/Stec20/stec24/stec24.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec27'), ('aln_name', 'seq3'), ('seq_path', 'Gengraph/Stec20/stec27/stec27.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec69'), ('aln_name', 'seq4'), ('seq_path', 'Gengraph/Stec20/stec69/stec69.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec76'), ('aln_name', 'seq5'), ('seq_path', 'Gengraph/Stec20/stec76/stec76.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec77'), ('aln_name', 'seq6'), ('seq_path', 'Gengraph/Stec20/stec77/stec77.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec78'), ('aln_name', 'seq7'), ('seq_path', 'Gengraph/Stec20/stec78/stec78.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec83'), ('aln_name', 'seq8'), ('seq_path', 'Gengraph/Stec20/stec83/stec83.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec87'), ('aln_name', 'seq9'), ('seq_path', 'Gengraph/Stec20/stec87/stec87.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec89'), ('aln_name', 'seq10'), ('seq_path', 'Gengraph/Stec20/stec89/stec89.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec91'), ('aln_name', 'seq11'), ('seq_path', 'Gengraph/Stec20/stec91/stec91.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec92'), ('aln_name', 'seq12'), ('seq_path', 'Gengraph/Stec20/stec92/stec92.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec97'), ('aln_name', 'seq13'), ('seq_path', 'Gengraph/Stec20/stec97/stec97.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec98'), ('aln_name', 'seq14'), ('seq_path', 'Gengraph/Stec20/stec98/stec98.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec106'), ('aln_name', 'seq15'), ('seq_path', 'Gengraph/Stec20/stec106/stec106.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec107'), ('aln_name', 'seq16'), ('seq_path', 'Gengraph/Stec20/stec107/stec107.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec108'), ('aln_name', 'seq17'), ('seq_path', 'Gengraph/Stec20/stec108/stec108.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec128'), ('aln_name', 'seq18'), ('seq_path', 'Gengraph/Stec20/stec128/stec128.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec141'), ('aln_name', 'seq19'), ('seq_path', 'Gengraph/Stec20/stec141/stec141.fa'), ('annotation_path', 'NA')])]
Conducting progressiveMauve
progressiveMauve Complete
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 124, in <module>
    genome_aln_graph = bbone_to_initGraph(bbone_file, parsed_input_dict)
  File "/home/bhookund/GenGraph/gengraph.py", line 1616, in bbone_to_initGraph
    backbone_lol = input_parser(bbone_file)
  File "/home/bhookund/GenGraph/gengraph.py", line 1189, in input_parser
    in_file = open(file_path, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'globalAlignment_output.backbone'

Cannot call progressiveMauve:

INFO:root:progressiveMauve
ERROR:root:progressiveMauve_call error
INFO:root:progressiveMauve Complete
INFO:root:Running bbone_to_initGraph

Thank you. Cheers, Devina

jambler24 commented 4 years ago

Hi Devina,

I will take a look at this today and get back to you asap.

Can you try running:

python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Gengraph/Stec20/stec20.txt --out_file_name output --progressiveMauve_path /opt/exp_soft/bioinf/Mauve/progressiveMauve/bin/progressiveMauve

Here the last flag specifies where the progressiveMauve binary is.

The other idea is to specify the full paths to your sequence files in stec20.txt; I think the ones you specified are relative paths.

Let me know if this helps. I will go over the output again and try to spot anything else.

Thank you for posting this. It helps improve the tool so that it gives better output when things go wrong and catches unexpected scenarios.
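Whether the shell can actually find the binary can be checked from Python before launching a long run. This is a minimal sketch using only the standard library; `find_executable` is a hypothetical helper name and is not part of GenGraph.

```python
import os
import shutil


def find_executable(name_or_path):
    """Return the absolute path of an executable, or None if it cannot be found.

    Accepts either a bare command name (looked up on PATH) or a path.
    """
    found = shutil.which(name_or_path)
    return os.path.abspath(found) if found else None


mauve = find_executable("progressiveMauve")
if mauve is None:
    print("progressiveMauve not found on PATH; pass its full path explicitly")
else:
    print("using", mauve)
```

If the lookup returns None, either the PATH entry points at the wrong directory or the binary is not executable, and passing the full path with --progressiveMauve_path is the safer option.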

Devinaseeruttun commented 4 years ago

Thank you for your mail.

I have modified the export path in my cluster .bash_profile as follows, and it is working:

export PATH=/opt/exp_soft/bioinf/mauve/linux-x64/:$PATH

I have specified the full path for stec20.txt.

Any idea of the memory requirement for this run?

Thank you again.
Cheers, Devina

jambler24 commented 4 years ago

I'll see if I can put in better checks for whether the paths are correct in the next patch.

As for the memory usage, it is always difficult to estimate, as it depends on a lot of variables, including the number of genomes, their size, and how similar they are. Try 16 GB, and let me know if you run into trouble.

Devinaseeruttun commented 4 years ago

Hi,

Thanks. My file size is 5.4 x 20 MB. I am now having the following problem: Caught signal 11.

Please advise. Thank you.

[bhookund@n14 ~]$ python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Gengraph/Stec20/stec20.txt --out_file_name output

Running GenGraph Toolkit

Creating genome graph

[OrderedDict([('seq_name', 'stec13'), ('aln_name', 'seq0'), ('seq_path', 'Gengraph/Stec20/stec13/stec13.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec18'), ('aln_name', 'seq1'), ('seq_path', 'Gengraph/Stec20/stec18/stec18.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec24'), ('aln_name', 'seq2'), ('seq_path', 'Gengraph/Stec20/stec24/stec24.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec27'), ('aln_name', 'seq3'), ('seq_path', 'Gengraph/Stec20/stec27/stec27.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec69'), ('aln_name', 'seq4'), ('seq_path', 'Gengraph/Stec20/stec69/stec69.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec76'), ('aln_name', 'seq5'), ('seq_path', 'Gengraph/Stec20/stec76/stec76.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec77'), ('aln_name', 'seq6'), ('seq_path', 'Gengraph/Stec20/stec77/stec77.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec78'), ('aln_name', 'seq7'), ('seq_path', 'Gengraph/Stec20/stec78/stec78.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec83'), ('aln_name', 'seq8'), ('seq_path', 'Gengraph/Stec20/stec83/stec83.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec87'), ('aln_name', 'seq9'), ('seq_path', 'Gengraph/Stec20/stec87/stec87.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec89'), ('aln_name', 'seq10'), ('seq_path', 'Gengraph/Stec20/stec89/stec89.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec91'), ('aln_name', 'seq11'), ('seq_path', 'Gengraph/Stec20/stec91/stec91.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec92'), ('aln_name', 'seq12'), ('seq_path', 'Gengraph/Stec20/stec92/stec92.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec97'), ('aln_name', 'seq13'), ('seq_path', 'Gengraph/Stec20/stec97/stec97.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec98'), ('aln_name', 'seq14'), ('seq_path', 'Gengraph/Stec20/stec98/stec98.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec106'), ('aln_name', 'seq15'), ('seq_path', 'Gengraph/Stec20/stec106/stec106.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec107'), ('aln_name', 'seq16'), ('seq_path', 'Gengraph/Stec20/stec107/stec107.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec108'), ('aln_name', 'seq17'), ('seq_path', 'Gengraph/Stec20/stec108/stec108.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec128'), ('aln_name', 'seq18'), ('seq_path', 'Gengraph/Stec20/stec128/stec128.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec141'), ('aln_name', 'seq19'), ('seq_path', 'Gengraph/Stec20/stec141/stec141.fa'), ('annotation_path', 'NA')])]

Conducting progressiveMauve

Caught signal 11

Cleaning up and exiting!

Temporary files deleted.

progressiveMauve Complete

Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 124, in <module>
    genome_aln_graph = bbone_to_initGraph(bbone_file, parsed_input_dict)
  File "/home/bhookund/GenGraph/gengraph.py", line 1616, in bbone_to_initGraph
    backbone_lol = input_parser(bbone_file)
  File "/home/bhookund/GenGraph/gengraph.py", line 1189, in input_parser
    in_file = open(file_path, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'globalAlignment_output.backbone'

jambler24 commented 4 years ago

Hi Devina,

The Caught signal 11 error is a segmentation fault, normally associated with memory issues.

It is tricky for me to debug from here, but I am thinking about it. It looks like Mauve fails, but the program keeps running and crashes when the output from Mauve is found to be missing.

The problem is probably with setting the scratch path for Mauve:

https://sourceforge.net/p/mauve/mailman/message/35408167/

I'll get back with a solution soonest.
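The failure mode described above (the aligner dies, but the pipeline carries on until a later step trips over the missing file) can be avoided by checking both the exit status and the expected output right after the external call. The following is a sketch only: `run_and_verify` and `AlignerError` are hypothetical names, not GenGraph functions, and the command passed in is whatever the caller builds.

```python
import os
import subprocess


class AlignerError(RuntimeError):
    """Raised when the external aligner fails or leaves no output behind."""


def run_and_verify(cmd, expected_output):
    """Run an external command and confirm that it produced expected_output.

    cmd is a list of arguments, e.g. the aligner binary followed by its options.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # On POSIX, a negative return code means the process was killed
        # by a signal (for example -11 for a segmentation fault).
        raise AlignerError(
            f"command exited with code {result.returncode}: {result.stderr.strip()}"
        )
    # A tool may trap the signal, clean up, and still exit with status 0,
    # so the output file is checked as well.
    if not os.path.isfile(expected_output):
        raise AlignerError(f"expected output file is missing: {expected_output}")
    return expected_output
```

In this thread the expected file would be the .backbone file named in the FileNotFoundError, so the run would stop with a clear aligner error instead of a traceback from the parser.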

Devinaseeruttun commented 4 years ago

Thank you for your reply. Can I increase the memory size? Will it help?

Cheers Devina

jambler24 commented 4 years ago

Possibly. It is hard to debug something happening on a cluster I am not familiar with!

Another option is running progressiveMauve separately, and then providing the .bbone file to gengraph with the --backbone_file flag.

Devinaseeruttun commented 4 years ago

Hi,

I have run progressiveMauve on my PC. When I try to run GenGraph, providing the .bbone file, I get the following error:

[bhookund@n07 ~]$ python3 GenGraph/gengraphTool.py make_genome_graph --backbone_file GenGraph/globalAlignment_output6.backbone --out_file_name outstec20
Running GenGraph Toolkit
Creating genome graph
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 92, in <module>
    parsed_input_dict = parse_seq_file(args.seq_file)
  File "/home/bhookund/GenGraph/gengraph.py", line 1317, in parse_seq_file
    seq_file_dict = input_parser(path_to_seq_file)
  File "/home/bhookund/GenGraph/gengraph.py", line 1081, in input_parser
    if file_path[-3:] == ".fa" or file_path[-6:] == ".fasta":
TypeError: 'NoneType' object is not subscriptable
[bhookund@n07 ~]$

Please help. Thank you.

Cheers

jambler24 commented 4 years ago

Hi Devina,

I have put out a new version of the code that allows you to specify scratch paths which should help with the cluster memory problem.

From your output above, it looks like you didn't specify the sequences you wanted to align with the --seq_file flag. This should be mandatory, I'll update that in the code.
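Making the flag mandatory turns the TypeError above into an immediate usage message. A minimal sketch with argparse follows; the flag names are taken from the commands in this thread, but the parser layout (`build_parser`, a positional `toolkit` argument) is an assumption and may not match how gengraphTool.py is actually organised.

```python
import argparse


def build_parser():
    """Build a reduced command-line parser (illustrative, not the real one)."""
    parser = argparse.ArgumentParser(prog="gengraphTool.py")
    parser.add_argument("toolkit", help="e.g. make_genome_graph")
    # required=True makes argparse exit with a usage error when the flag is
    # missing, instead of leaving args.seq_file set to None.
    parser.add_argument("--seq_file", required=True)
    parser.add_argument("--backbone_file")
    parser.add_argument("--out_file_name")
    return parser


args = build_parser().parse_args(
    ["make_genome_graph", "--seq_file", "stec20.txt", "--backbone_file", "stec.backbone"]
)
print(args.seq_file, args.backbone_file)
```

With this layout, running the tool with only --backbone_file stops with "the following arguments are required: --seq_file" before any parsing of sequence files is attempted.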

Devinaseeruttun commented 4 years ago

Hi,

Thank you. It is still not working. I got the same Caught signal 11 error:

...Stec20/stec106/stec106.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec107'), ('aln_name', 'seq16'), ('seq_path', 'Gengraph/Stec20/stec107/stec107.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec108'), ('aln_name', 'seq17'), ('seq_path', 'Gengraph/Stec20/stec108/stec108.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec128'), ('aln_name', 'seq18'), ('seq_path', 'Gengraph/Stec20/stec128/stec128.fa'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'stec141'), ('aln_name', 'seq19'), ('seq_path', 'Gengraph/Stec20/stec141/stec141.fa'), ('annotation_path', 'NA')])]
Conducting progressiveMauve
Caught signal 11
Cleaning up and exiting!
Temporary files deleted.
progressiveMauve Complete
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 124, in <module>
    genome_aln_graph = bbone_to_initGraph(bbone_file, parsed_input_dict)
  File "/home/bhookund/GenGraph/gengraph.py", line 1616, in bbone_to_initGraph
    backbone_lol = input_parser(bbone_file)
  File "/home/bhookund/GenGraph/gengraph.py", line 1189, in input_parser
    in_file = open(file_path, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'globalAlignment_stec.backbone'
[bhookund@n29 ~]$

Still struggling with it. Cheers, Devina

Devinaseeruttun commented 4 years ago

Hi,

I have been running GenGraph on the cluster, providing the .bbone file to gengraph with the --backbone_file flag.

The problem is networkx 2.5. Please help. Thank you.

Cheers, Devina

Devinaseeruttun commented 3 years ago

Hi,

I am again having the following error and I have attached my Myco.txt file.

Please help.

Creating genome graph
[OrderedDict([('seq_name', 'myco1'), ('aln_name', 'seq0'), ('seq_path', 'Documents/Myco/myco1/myco1.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'myco2'), ('aln_name', 'seq1'), ('seq_path', 'Documents/Myco/myco2/myco2.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'myco3'), ('aln_name', 'seq2'), ('seq_path', 'Documents/Myco/myco3/myco3.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'myco4'), ('aln_name', 'seq3'), ('seq_path', 'Documents/Myco/myco4/myco4.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'myco5'), ('aln_name', 'seq4'), ('seq_path', 'Documents/Myco/myco5/myco5.fasta'), ('annotation_path', 'NA')])]
Conducting progressiveMauve
progressiveMauve Complete
Conducting local node realignment
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 159, in <module>
    add_graph_data(genome_aln_graph)
  File "/home/yasmina/GenGraph/gengraph.py", line 2632, in add_graph_data
    if abs(int(data[an_isolate + '_leftend'])) == 1:
KeyError: '_leftend'

Log output (last few lines)

INFO:root:Adding sequences
INFO:root:checking nodes
INFO:root:Fast local node realign: Aln_1970
INFO:root:{'myco1_leftend': 5465160, 'myco1_rightend': 5471303, 'myco2_leftend': 2163056, 'myco2_rightend': 2169201, 'myco4_leftend': 4770664, 'myco4_rightend': 4776808, 'myco5_leftend': 4819442, 'myco5_rightend': 4825586, 'ids': 'myco1,myco2,myco4,myco5', 'name': 'Aln_1970'}
INFO:root:conducting mafft alignment
INFO:root:6145
INFO:root:Adding sequences
INFO:root:checking nodes
INFO:root:Fast local node realign: Aln_1971
INFO:root:{'myco1_leftend': 5471304, 'myco1_rightend': 5471307, 'myco4_leftend': 4776809, 'myco4_rightend': 4776812, 'myco5_leftend': 4825587, 'myco5_rightend': 4825590, 'ids': 'myco1,myco4,myco5', 'name': 'Aln_1971'}
INFO:root:conducting mafft alignment
INFO:root:4
INFO:root:Adding sequences
INFO:root:checking nodes
INFO:root:Fast local node realign: Aln_1972
INFO:root:{'myco1_leftend': 5471308, 'myco1_rightend': 5475490, 'myco2_leftend': 2169260, 'myco2_rightend': 2173448, 'myco4_leftend': 4776813, 'myco4_rightend': 4781002, 'myco5_leftend': 4825591, 'myco5_rightend': 4829781, 'ids': 'myco1,myco2,myco4,myco5', 'name': 'Aln_1972'}
INFO:root:conducting mafft alignment
INFO:root:4191
INFO:root:Adding sequences
INFO:root:checking nodes
INFO:root:Aln_79
INFO:root:{'myco1_leftend': 183336, 'myco1_rightend': 183336, 'ids': 'myco1', 'name': 'Aln_79'}
INFO:root:Aln_100
INFO:root:{'ids': '', 'name': 'Aln_100'}
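The last log entry may explain the KeyError: node Aln_100 has an empty 'ids' string and no position keys. A minimal sketch of how that could produce the key '_leftend' follows. It assumes, without confirmation from the GenGraph source, that isolate names are obtained by splitting 'ids' on commas; `isolates_in_node` is a hypothetical guard, not an existing function.

```python
def isolates_in_node(data):
    """Return the isolate names of a node, skipping empty entries."""
    return [name for name in data.get("ids", "").split(",") if name]


node = {"ids": "", "name": "Aln_100"}

# Naive split: an empty string still produces one (empty) element,
# so the lookup key degenerates to just the suffix.
naive = node["ids"].split(",")
print(naive)                   # ['']
print(naive[0] + "_leftend")   # _leftend

# Guarded version: no isolates, so no lookup is attempted for this node.
print(isolates_in_node(node))  # []
```

If that assumption holds, the question to investigate is why the alignment produced a node that belongs to no isolate, rather than the lookup itself.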

Thank you

Cheers,

Devina

seq_name  aln_name  seq_path                           annotation_path
myco1     seq0      Documents/Myco/myco1/myco1.fasta   NA
myco2     seq1      Documents/Myco/myco2/myco2.fasta   NA
myco3     seq2      Documents/Myco/myco3/myco3.fasta   NA
myco4     seq3      Documents/Myco/myco4/myco4.fasta   NA
myco5     seq4      Documents/Myco/myco5/myco5.fasta   NA

jambler24 commented 3 years ago

Hi Devina,

It looks like it is not a full path perhaps?

Documents/Myco/myco5/myco5.fasta

Depending on the OS, please just check if /Documents/Myco/myco5/myco5.fasta is not meant to be something like:

/Users/your_username/Documents/Myco/myco5/myco5.fasta
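A quick way to see what a relative seq_path resolves to, and whether the file is really there, is sketched below. `resolve_seq_path` is a hypothetical helper for illustration; a relative path is interpreted against the directory the tool is started from unless `base_dir` is given.

```python
import os


def resolve_seq_path(seq_path, base_dir=None):
    """Turn a possibly relative seq_path into a normalised absolute path.

    base_dir defaults to the current working directory.
    """
    expanded = os.path.expanduser(seq_path)
    if not os.path.isabs(expanded):
        expanded = os.path.join(base_dir or os.getcwd(), expanded)
    return os.path.normpath(expanded)


path = resolve_seq_path("Documents/Myco/myco5/myco5.fasta")
print(path, "exists" if os.path.isfile(path) else "is missing")
```

Running this from the same directory as the GenGraph command shows the full path that would be opened, which can then be written into the sequence file directly.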

Devinaseeruttun commented 3 years ago

Thank you for your reply. I am working on Linux on a workstation. I'll check and let you know. Thank you again.

Cheers, Devina

Devinaseeruttun commented 3 years ago

Hi,

I am still having the same problem. I did countercheck it with other sequences, and it works. Strange. Can you advise? Thank you.

Cheers, Devina

Devinaseeruttun commented 3 years ago

Hi,

I remember having this problem initially, and the problem was related to the networkx version. The networkx version is 2.5. Can you help? Thank you.

Kind regards, Devina


yasmina@CZC025716G:~$ python3 GenGraph/gengraphTool.py make_genome_graph --seq_file Documents/strep.txt --out_file_name strep --progressiveMauve_path ~/bin/progressiveMauve
Running GenGraph Toolkit
Creating genome graph
[OrderedDict([('seq_name', 'strep1'), ('aln_name', 'seq0'), ('seq_path', 'Documents/Streptococcus/strep1/strep1.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep2'), ('aln_name', 'seq1'), ('seq_path', 'Documents/Streptococcus/strep2/strep2.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep3'), ('aln_name', 'seq2'), ('seq_path', 'Documents/Streptococcus/strep3/strep3.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep4'), ('aln_name', 'seq3'), ('seq_path', 'Documents/Streptococcus/strep4/strep4.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep13'), ('aln_name', 'seq4'), ('seq_path', 'Documents/Streptococcus/strep13/strep13.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep14'), ('aln_name', 'seq5'), ('seq_path', 'Documents/Streptococcus/strep14/strep14.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep15'), ('aln_name', 'seq6'), ('seq_path', 'Documents/Streptococcus/strep15/strep15.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep16'), ('aln_name', 'seq7'), ('seq_path', 'Documents/Streptococcus/strep16/strep16.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep18'), ('aln_name', 'seq8'), ('seq_path', 'Documents/Streptococcus/strep18/strep18.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep19'), ('aln_name', 'seq9'), ('seq_path', 'Documents/Streptococcus/strep19/strep19.fasta'), ('annotation_path', 'NA')]),
 OrderedDict([('seq_name', 'strep20'), ('aln_name', 'seq10'), ('seq_path', 'Documents/Streptococcus/strep20/strep20.fasta'), ('annotation_path', 'NA')])]
Conducting progressiveMauve
progressiveMauve Complete
Conducting local node realignment
Traceback (most recent call last):
  File "GenGraph/gengraphTool.py", line 159, in <module>
    add_graph_data(genome_aln_graph)
  File "/home/yasmina/GenGraph/gengraph.py", line 2632, in add_graph_data
    if abs(int(data[an_isolate + '_leftend'])) == 1:
KeyError: '_leftend'

jambler24 commented 3 years ago

Hi Devina,

I put out a new version of the code, looking to squash some bugs. Hopefully it helped with this!

Just need to update the docker container, which should allow you to run without worrying about networkx versions.

Will update here when the container is ready.

Devinaseeruttun commented 3 years ago

Thank you for the update.

Cheers, Devina

jambler24 commented 3 years ago

Just ran a test with networkx 2.5, and it looks like it worked.

Maybe try using the Docker container? That way we know it is not a software version issue.
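To rule a version mismatch in or out, it helps to report exactly which networkx version the failing environment has. A small sketch using the standard library (Python 3.8 or later) follows; `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("networkx:", installed_version("networkx"))
```

Running it with the same python3 that launches gengraphTool.py matters, since a cluster or workstation can have several Python installations with different packages.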