OLC-Bioinformatics / ConFindr

Intra-species bacterial contamination detection
https://olc-bioinformatics.github.io/ConFindr/
MIT License

Problems with ConFindr rMLST database #27

Closed KasiaTluscik closed 1 year ago

KasiaTluscik commented 2 years ago

Hi :)

I am having trouble getting the rMLST database for ConFindr. Following the instructions produces this error:

Traceback (most recent call last):
  File "/home/msszwarc/miniconda3/envs/confindr/bin/confindr_database_setup", line 10, in <module>
    sys.exit(main())
  File "/home/msszwarc/miniconda3/envs/confindr/lib/python3.6/site-packages/confindr_src/database_setup.py", line 270, in main
    args.secret_file)
  File "/home/msszwarc/miniconda3/envs/confindr/lib/python3.6/site-packages/confindr_src/database_setup.py", line 209, in setup_confindr_database
    record.seq._data = record.seq._data.replace('-', '').replace('N', '')
TypeError: a bytes-like object is required, not 'str'

adamkoziol commented 2 years ago

Hi Kasia,

I took a look at this issue, and I'm not quite sure why it's happening. The best I can figure is that something changed within BioPython, and the record.seq._data protected attribute is now encoded. The odd thing is that I could recreate this issue with my Python 3.6 environment, but not my Python 3.8 testing environment.
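The error message itself can be reproduced without Biopython, which supports the idea that the attribute now holds bytes. The following is a minimal sketch: the variable data is a hypothetical stand-in for record.seq._data, assumed here to be a bytes object.

```python
# Hypothetical stand-in for record.seq._data when it is stored as bytes
data = b'AC-GTN'

try:
    # Calling bytes.replace with str arguments raises TypeError
    data.replace('-', '')
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# The same call succeeds when the arguments are bytes as well
cleaned = data.replace(b'-', b'').replace(b'N', b'')
print(cleaned)
```

If the data were a plain str, the first call would succeed, which would be consistent with the behaviour differing between environments.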

I ended up replacing line 209 in database_setup.py with a try/except to address this issue. This will eventually be reflected in the bioconda package, but in the meantime, please replace line 209 in "/home/msszwarc/miniconda3/envs/confindr/lib/python3.6/site-packages/confindr_src/database_setup.py":

record.seq._data = record.seq._data.replace('-', '').replace('N', '')

with:

record.seq._data = str(record.seq._data).replace('-', '').replace('N', '').encode()
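One caveat with the str(...) workaround: if the value is already a bytes object, str() returns its printable representation (including the b'...' wrapper) rather than the decoded letters. Below is a minimal sketch of a type-tolerant alternative using try/except, assuming the data is either str or bytes. The name strip_gaps_and_ns is a hypothetical helper, not part of ConFindr or Biopython, and this is not necessarily how the released fix is written.

```python
def strip_gaps_and_ns(data):
    """Remove '-' and 'N' characters from sequence data held as str or bytes.

    Hypothetical helper for illustration; returns the same type it was given.
    """
    try:
        # Works when the data is a plain str
        return data.replace('-', '').replace('N', '')
    except TypeError:
        # bytes and bytearray objects need bytes arguments instead
        return data.replace(b'-', b'').replace(b'N', b'')


# Both representations are cleaned without changing their type
from_str = strip_gaps_and_ns('AC-GNT')
from_bytes = strip_gaps_and_ns(b'AC-GNT')

# For comparison: str() on bytes keeps the b'...' wrapper in the text
wrapped = str(b'AC-GN')
```

With a helper like this, line 209 could in principle become record.seq._data = strip_gaps_and_ns(record.seq._data), though _data is a protected attribute whose type may change again between Biopython releases.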

Please let me know if this addresses your issue.

Best regards, and good luck, A

jenmuell commented 2 years ago

Hello A,

I ran into the same problem with both Python versions (3.7 and 3.8). Your code snippet solved the problem for Python 3.8 and I haven't tried it with Python 3.7.

pimarin commented 1 year ago

Hello, I encountered the same problem and modified the Python script by removing the _data.

Before:

record.seq._data = str(record.seq._data).replace('-', '').replace('N', '').encode()

After:

record.seq = str(record.seq).replace('-', '').replace('N', '').encode()

The replace works on the object itself, whereas record.seq._data gives a string. This solved the problem and also works fine.

pcrxn commented 1 year ago

Fixed by 19d0d1d for v0.8.1.