Closed. namsca closed this issue 5 years ago.
Hmm. I'll have to test this more; I have no idea yet. Are the input files small enough that you could share them with me (perhaps outside GitHub, by sending them to me directly)?
On Fri, May 31, 2019 at 10:53 AM namsca notifications@github.com wrote:
Hi,
I couldn't get onto the Google Groups page https://groups.google.com/forum/#!forum/ensemble-of-hmms, so apologies if this is the wrong place for troubleshooting. I keep getting a strange error that I can't get around, no matter what I change. The error log is attached: e6951411.txt https://github.com/smirarab/sepp/files/3242326/e6951411.txt
Any ideas what might be happening? The command I am using is:
python /data/users/nscales/miniconda3/pkgs/sepp-4.3.8-py37_0/bin/run_sepp.py -t RAxML_ezread.nwk -a microbacteriaceae-aligned-short-names.fasta -f rep-seqs.fasta -r RAxML_info.txt -o five
I am trying to work out exactly what is being placed where by manually changing the query FASTA file and then using guppy to_csv to look at the edge_num values. SEPP was working fine yesterday, so I wondered whether the short length of the query FASTA might be the cause?
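For the edge_num inspection described above, a minimal sketch of grouping query names by placement edge could look like the following. It assumes the CSV written by guppy to_csv has a header row with an `edge_num` column (mentioned in the post) and a `name` column (an assumption; check the header of your own file). The helper name `placements_by_edge` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def placements_by_edge(csv_file, name_col="name", edge_col="edge_num"):
    """Group query names by the edge they were placed on.

    Column names are assumptions; adjust them to match the CSV header.
    """
    by_edge = defaultdict(list)
    for row in csv.DictReader(csv_file):
        by_edge[int(row[edge_col])].append(row[name_col])
    return dict(by_edge)


# Made-up rows for illustration only; a real file has more columns.
example = io.StringIO(
    "name,edge_num\n"
    "query_a,12\n"
    "query_b,12\n"
    "query_c,40\n"
)
groups = placements_by_edge(example)
for edge, names in sorted(groups.items()):
    print(edge, names)
```

With a real file, the same function can be called on an open file handle instead of the in-memory example.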
-- Siavash Mirarab
I’m not sure why this is happening, but I think the problem arises at the “with open(job.result, 'r') as f” line. I’m fairly sure the job whose results are being read there is the hmmsearch job. Under default behavior its output does not get written to a file (unless that has changed on the master branch, which I have been away from for a while), which would explain why the job.result object is not a file path.
The question is why, though. When the job gets created, there are options that define that behavior, and I’m not sure why they would get switched unexpectedly.
One thing you could try is running it with the SEPP option to keep temporary files. I usually use that option regardless. That said, I’m not sure why that would lead to this.
-- Michael Nute Mike.Nute@gmail.com
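The failure mode described above, where a result is expected to be a file path but is actually something else, can be illustrated with a small defensive reader. This is only a sketch under assumptions: `read_job_result` is a hypothetical helper, not part of SEPP, and it is not the fix that was eventually committed.

```python
import os
import tempfile


def read_job_result(result):
    """Return job output as text, whether `result` is a path or the text itself.

    Hypothetical helper for illustration; not SEPP's actual code.
    """
    # A string (or path-like object) that names an existing file is read from disk.
    if isinstance(result, (str, os.PathLike)) and os.path.isfile(result):
        with open(result, "r") as f:
            return f.read()
    # Any other string is assumed to already be the output itself.
    if isinstance(result, str):
        return result
    # Anything else would make a bare open(result) fail, so report it clearly.
    raise TypeError(f"unexpected job result type: {type(result).__name__}")


# Demonstration with a temporary file and with in-memory text.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hits from file\n")
from_file = read_job_result(tmp.name)
from_text = read_job_result("hits in memory\n")
os.remove(tmp.name)
```

Passing a non-string object such as `None` raises a `TypeError` with a readable message, which is easier to diagnose than a failure inside `open`.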
Hi - thanks for getting back to me!
As an update, the length of the file looks to be the key factor in whether this works: I took a FASTA that had worked previously, cut it down to 300 lines, and it ran fine, but when I cut it down to 200 lines I got the above error.
I have attached the file that worked. If you would like to try to reproduce the error, I ran the following command to cut it down to 200 lines:
sed '201,300d' 300-seqs.fasta > 200-seqs.fasta
At which point the following command stopped working:
python /data/users/nscales/miniconda3/pkgs/sepp-4.3.8-py37_0/bin/run_sepp.py -t RAxML_ezread.nwk -a microbacteriaceae-aligned-short-names.fasta -f 200-seqs.fasta -r RAxML_info.txt
My tree has 69 sequences (138 lines). Could it be something to do with having a query FASTA that is only slightly bigger than the tree itself? If this still seems strange, let me know and I can send you the input files.
Thanks, Nick
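Because the sed command above trims by line number, it yields exactly 100 sequences only if every record occupies two lines (one header, one sequence line). A rough sketch of counting and subsetting by record instead is shown here. The function names are illustrative, and the sample records are made up.

```python
from itertools import islice


def read_fasta(lines):
    """Yield (header, sequence) pairs; a sequence may span several lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def first_n_records(lines, n):
    """Return the first n complete FASTA records."""
    return list(islice(read_fasta(lines), n))


# Made-up records; the first one is wrapped over two sequence lines.
example = [">s1", "ACGT", "AC", ">s2", "GG", ">s3", "TTT"]
records = list(read_fasta(example))
subset = first_n_records(example, 2)
print(len(records), "records;", len(example), "lines")
```

On a real file, `read_fasta` can be given an open file handle (for example the 300-seqs.fasta file mentioned above), and `len(list(...))` then gives the number of query sequences regardless of line wrapping.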
Can you also share with me RAxML_ezread.nwk, microbacteriaceae-aligned-short-names.fasta, and RAxML_info.txt? I need to be able to reproduce the error in order to debug it. Thanks
Never mind. I found the source of the error and can reproduce it with another file. Will fix soon.
Fixed in version 4.9.10, commit: bd26318e7857a98c5917a1b0c7b97aa4a9096e2c
Nick, thanks a lot for the bug report. It helped catch a corner case that had not been tested before.
Hi Siavash,
If you create a new release for this version, I can build an updated bioconda package.
Best, Stefan
Release was created