smirarab / sepp

Ensemble of HMM methods (SEPP, TIPP, UPP)
GNU General Public License v3.0

Getting a strange error: Exception: expected str, bytes or os.PathLike object, not dict #70

Closed namsca closed 5 years ago

namsca commented 5 years ago

Hi,

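I couldn't get onto the Google Groups page (https://groups.google.com/forum/#!forum/ensemble-of-hmms), so apologies if this is the wrong place for troubleshooting, but I can't get around this strange error, which I keep getting no matter what I change. e6951411.txt (https://github.com/smirarab/sepp/files/3242326/e6951411.txt)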

Any ideas what might be happening? The command I am using is:

python /data/users/nscales/miniconda3/pkgs/sepp-4.3.8-py37_0/bin/run_sepp.py \
    -t RAxML_ezread.nwk \
    -a microbacteriaceae-aligned-short-names.fasta \
    -f rep-seqs.fasta \
    -r RAxML_info.txt \
    -o five

I am trying to work out exactly what is being placed where by manually changing the query FASTA file and then using guppy to_csv to look at the edge_num values. SEPP was working fine yesterday, so I wondered whether the short length of the query FASTA could be the cause?

smirarab commented 5 years ago

hmm. I have to test this more. No idea. Are the input files small enough that you can share them with me (perhaps outside github and sending them directly to me)?


MGNute commented 5 years ago

I’m not sure why this is happening, but I think the problem occurs at the “with open(job.result, 'r') as f” line. I’m pretty sure the job whose results it is looking for there is the hmmsearch job. Under default behavior that job's output does not get written to a file (unless that has changed on the master branch, which I have been away from for a while), which is why job.result is not a file path.

The question is why tho. When the job gets created there are options that define that behavior, and I’m not sure why they’d get switched unexpectedly.

One thing you could try is to run it with the sepp option to keep temporary files. I usually use that option regardless. But I’m not sure why that would lead to this.
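To illustrate the failure mode described above, here is a minimal sketch. It is not SEPP's actual code: the function names and the sample dict are hypothetical, and the only assumption carried over from the thread is that a job result can be either a file path or an in-memory object. Passing a dict to open() raises the TypeError quoted in the issue title, and a type check before opening avoids it.

```python
import os


def reproduce_error():
    """Pass a dict to open() and return the resulting error message."""
    fake_result = {"query_1": (0, 10)}  # hypothetical in-memory result
    try:
        with open(fake_result, "r") as f:
            f.read()
    except TypeError as exc:
        return str(exc)
    return None


def read_job_result(result):
    """Return the result contents whether it is a path or already in memory.

    Hypothetical helper: only path-like values are opened as files;
    anything else is assumed to be the parsed result itself.
    """
    if isinstance(result, (str, bytes, os.PathLike)):
        with open(result, "r") as f:
            return f.read()
    return result


if __name__ == "__main__":
    print(reproduce_error())
    print(read_job_result({"query_1": (0, 10)}))
```

Whether a guard like this belongs at the call site, or whether the job should be configured so that its result type is always the same, depends on how the jobs are created; the sketch only shows why the error message mentions a dict.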


namsca commented 5 years ago

Hi - thanks for getting back to me!

As an update, the length of the file looks to be the key factor in whether this works: I took a small subset of the fasta that worked previously, cut it down to 300 lines, and it worked, but when I cut it down to 200 lines I got the above error.

300-seqs.fasta.txt

I have attached the file that worked. If you would like to see whether you can reproduce this error, I ran the following command to cut it down to 200 lines:

sed '201,300d' 300-seqs.fasta > 200-seqs.fasta

At which point the following command stopped working:

python /data/users/nscales/miniconda3/pkgs/sepp-4.3.8-py37_0/bin/run_sepp.py -t RAxML_ezread.nwk -a microbacteriaceae-aligned-short-names.fasta -f 200-seqs.fasta -r RAxML_info.txt

My tree has 69 sequences (138 lines) - could it be something to do with having a query fasta which is only slightly bigger than the tree itself? If this still seems strange let me know and I can send you the input files.

Thanks, Nick
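For anyone reproducing this with a different query file: the sed command above truncates by line number, which keeps whole records only when every sequence occupies exactly two lines (as the "69 sequences (138 lines)" figure suggests for the tree alignment). A minimal sketch of truncating by record instead is shown below; the function name is hypothetical and it is not part of SEPP.

```python
def first_fasta_records(lines, n_records):
    """Keep the first n_records FASTA records from an iterable of lines.

    A record starts at a line beginning with '>' and runs until the next
    such line, so wrapped (multi-line) sequences are kept intact.
    """
    kept = []
    seen = 0
    for line in lines:
        if line.startswith(">"):
            seen += 1
            if seen > n_records:
                break
        if seen >= 1:  # ignore anything before the first header
            kept.append(line)
    return kept


if __name__ == "__main__":
    example = [">a\n", "AC\n", "GT\n", ">b\n", "TT\n", ">c\n", "GG\n"]
    print("".join(first_fasta_records(example, 2)), end="")
```

With two-line records, keeping the first 100 records this way should give the same file as deleting lines 201 onward.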

smirarab commented 5 years ago

Can you also share with me RAxML_ezread.nwk, microbacteriaceae-aligned-short-names.fasta, and RAxML_info.txt? I need to be able to reproduce the error in order to debug it. Thanks

smirarab commented 5 years ago

Never mind. I found the source of the error and can reproduce it with another file. Will fix soon.

smirarab commented 5 years ago

Fixed in version 4.9.10, commit: bd26318e7857a98c5917a1b0c7b97aa4a9096e2c

Nick, thanks a lot for the bug report. It helped catch a corner case that had not been tested before.

sjanssen2 commented 5 years ago

Hi Siavash,

if you create a new release for this version, I can build an updated bioconda package.

Best, Stefan


smirarab commented 5 years ago

Release was created