PatrickKueck / FASconCAT-G

FASconCAT-G offers a wide range of possibilities to edit and concatenate multiple nucleotide, amino acid, and structure sequence alignment files for phylogenetic and population genetic purposes. The main options include sequence renaming, file format conversion, sequence translation, consensus generation of predefined sequence blocks, and RY coding as well as site exclusions in nucleotide sequences. FASconCAT-G implemented process options can be invoked in any combination and performed during a single process run. FASconCAT-G can also read in and handle different file formats (FASTA, CLUSTAL, and PHYLIP) in a single run.

"Killed" while concatenating #3

Closed jananiharan closed 3 years ago

jananiharan commented 3 years ago

Hello,

FASconCAT abruptly stops running while concatenating sequences with the only message being "Killed". I'm running it on a Linux command line, with option -k.

Is this a memory issue? How can I fix it?


PatrickKueck commented 3 years ago

Hi Janani,

it seems that you have run out of RAM. My only advice is to use a computer with more RAM or a smaller data set. I don't think it helps much, but you could also shorten your sequence names. This would reduce RAM use a little.

Best,
Patrick


Dr. Patrick Kück Algorithmic Development & Computational Biology Zoological Research Museum Alexander Koenig (ZFMK) Adenauerallee 160, 53113 Bonn, Germany www.zfmk.de
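
To get a feeling for whether a data set can fit into RAM at all, a rough lower bound can be computed before starting a run. The sketch below is not part of FASconCAT-G. It assumes that the supermatrix holds one row per distinct sequence name and one column per site of every infile, stored at one byte per character; the real memory footprint of the program will be higher. The function names and the example numbers are hypothetical.

```python
def supermatrix_cells(n_names, alignment_lengths):
    """Number of characters in a supermatrix with one row per distinct
    sequence name and one column per site of every infile."""
    return n_names * sum(alignment_lengths)


def as_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 1024**3


# Hypothetical data set: 100 taxa, 300 infiles, 1,000 sites per infile.
lengths = [1000] * 300
cells = supermatrix_cells(100, lengths)
print(f"{cells:,} characters, at least {as_gib(cells):.2f} GiB at 1 byte each")
```

If the estimate is already close to the installed RAM, the run is unlikely to finish, because the program needs working memory on top of the matrix itself.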

   

Sent: Friday, 29 January 2021, 04:43. From: "Janani Hariharan" notifications@github.com. To: "PatrickKueck/FASconCAT-G" FASconCAT-G@noreply.github.com. Cc: "Subscribed" subscribed@noreply.github.com. Subject: [PatrickKueck/FASconCAT-G] "Killed" while concatenating (#3)

 


PatrickKueck commented 3 years ago

Also check the sequence names in your infiles. Sequences that should be concatenated with each other must have identical names. Otherwise, fccg treats each sequence name as unique and, for every file in which that name is absent, fills in N's or X's (missing data). This can increase RAM use enormously and is certain to produce an unintended supermatrix file.
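
The name check described above can be done before a run with a short script. This is a minimal sketch, not part of FASconCAT-G, and it assumes that all infiles are in FASTA format. The helper names `read_fasta_names` and `name_report` are hypothetical. It counts in how many files each sequence name occurs, lists names that are missing from at least one file, and lists names that differ only in upper and lower case.

```python
from collections import Counter, defaultdict


def read_fasta_names(path):
    """Return the sequence names (header lines without '>') of one FASTA file."""
    names = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                names.append(line[1:].strip())
    return names


def name_report(fasta_paths):
    """Compare sequence names across files (case-sensitive).

    Returns three values:
      counts   - name -> number of files containing that name
      partial  - the subset of counts for names missing from at least one file
      variants - lowercased name -> sorted spellings, for names that differ
                 only in upper and lower case
    """
    counts = Counter()
    for path in fasta_paths:
        counts.update(set(read_fasta_names(path)))
    n_files = len(fasta_paths)
    partial = {name: n for name, n in counts.items() if n < n_files}

    by_lower = defaultdict(set)
    for name in counts:
        by_lower[name.lower()].add(name)
    variants = {k: sorted(v) for k, v in by_lower.items() if len(v) > 1}
    return counts, partial, variants


# Example call (the file names are placeholders):
# counts, partial, variants = name_report(["gene1.fas", "gene2.fas"])
```

A name listed in `partial` is not necessarily an error, since a taxon can legitimately be absent from some infiles. Many names that each occur in only one file, or any entry in `variants`, point to the naming problem described above.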

 


jananiharan commented 3 years ago

Thanks! The sequence names were all identical and about 6 characters long each, so it must have been the RAM.

PatrickKueck commented 3 years ago
Welcome. It’s definitely the RAM. The question is whether your data set is simply too big or whether something in your input data is not specified correctly. A colleague of mine also got a system kill a couple of years ago because the same taxon had a different name in each infile. The data set included more than 100 taxa and hundreds of infiles. Because fccg interpreted the different names of the same taxon as independent taxa, the number of concatenated sequences in the supermatrix blew up dramatically.

But if your sequence names are all identical (including upper and lower case), it seems that you simply don’t have enough RAM. One way around this could be to split the taxon set of each infile into two partitions. You could then concatenate the infiles of each taxon partition separately, which would reduce RAM use by half. Afterwards you can easily copy both partition supermatrices into one file.

Best,
Patrick
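
The splitting workaround could be scripted roughly as follows. This is only a sketch under several assumptions: the infiles and both supermatrices are in FASTA format, each concatenation run is started separately on the two output directories, and both runs process the same infiles in the same order so that the columns of the two supermatrices line up. All function names and directory names are hypothetical and are not part of FASconCAT-G.

```python
from pathlib import Path


def read_fasta(path):
    """Parse a FASTA file into a dict {name: sequence}, keeping record order."""
    records = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
                records[name] = []
            elif name is not None and line:
                records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def write_fasta(records, path):
    """Write {name: sequence} as FASTA with one line per sequence."""
    with open(path, "w") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n{seq}\n")


def split_infile(path, first_half, dir_a, dir_b):
    """Write taxa listed in first_half to dir_a and all other taxa to dir_b."""
    records = read_fasta(path)
    part_a = {n: s for n, s in records.items() if n in first_half}
    part_b = {n: s for n, s in records.items() if n not in first_half}
    for part, out_dir in ((part_a, dir_a), (part_b, dir_b)):
        out_dir = Path(out_dir)
        out_dir.mkdir(parents=True, exist_ok=True)
        write_fasta(part, out_dir / Path(path).name)


def merge_supermatrices(path_a, path_b, out_path):
    """Copy two partition supermatrices into one file, with sanity checks."""
    merged = read_fasta(path_a)
    other = read_fasta(path_b)
    overlap = merged.keys() & other.keys()
    if overlap:
        raise ValueError(f"taxa present in both partitions: {sorted(overlap)}")
    merged.update(other)
    if len({len(seq) for seq in merged.values()}) > 1:
        raise ValueError("partition supermatrices differ in length")
    write_fasta(merged, out_path)
```

The length check in `merge_supermatrices` matters because an infile that contains no taxon of one partition yields an empty file for that partition, and the two supermatrices may then no longer have the same columns.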
