ku-cbd / PhageBoost

Rapid discovery of novel prophages using biological feature engineering and machine learning
GNU General Public License v3.0

KeyError: 'X' when processing test data #18

Closed lowel0 closed 3 years ago

lowel0 commented 3 years ago

Hi,

I have successfully installed PhageBoost 1.3.1 using pip install PhageBoost, but I have been unable to process the NC_000907.fasta.gz test data. I get the following error output:

PhageBoost -f NC_000907.fasta.gz -o results
processing: NC_000907
time after genecalls: 4.93197774887085
Traceback (most recent call last):
  File "/home/lowel/.local/bin/PhageBoost", line 8, in <module>
    sys.exit(main())
  File "/home/lowel/.local/lib/python3.9/site-packages/PhageBoost/main.py", line 219, in main
    df = calculate_features(genecalls)
  File "/home/lowel/.local/lib/python3.9/site-packages/PhageBoost/main.py", line 59, in calculate_features
    df, _, _ = calc_features.df2AAandDNAfeatures(genecalls, name='header')
  File "/home/lowel/.local/lib/python3.9/site-packages/PhageBoost/calc_features.py", line 215, in df2AAandDNAfeatures
    DF, DF_AA, DF_DNA = RunAAandDNA(dna_entries, aa_entries, locations)
  File "/home/lowel/.local/lib/python3.9/site-packages/PhageBoost/calc_features.py", line 202, in RunAAandDNA
    DF_AA = RunAA(AA_entries, verbose = verbose)
  File "/home/lowel/.local/lib/python3.9/site-packages/PhageBoost/calc_features.py", line 178, in RunAA
    df_biopython = biopython_proteinanalysis(entries, scaling=scaling)
  File "/home/lowel/.local/lib/python3.9/site-packages/PhageBoost/calc_features.py", line 168, in biopython_proteinanalysis
    d = biopython_proteinanalysis_seq(seq, scaling=scaling)
  File "/home/lowel/.local/lib/python3.9/site-packages/PhageBoost/calc_features.py", line 33, in biopython_proteinanalysis_seq
    flex = np.array(res.flexibility())
  File "/home/lowel/.local/lib/python3.9/site-packages/Bio/SeqUtils/ProtParam.py", line 183, in flexibility
    score += (flexibilities[front] + flexibilities[back]) * weights[j]
KeyError: 'X'

I am using xgboost==1.1.1, and I have tried various combinations of Python, PhageBoost and xgboost versions, to no avail.

Can anybody please help with this?

Sincere regards,

tsp-kucbd commented 3 years ago

I cannot reproduce this error, but it seems something is happening inside Biopython. Can you show us which Biopython version you are using?

import Bio
print(Bio.__version__)
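Since the thread goes on to compare several package versions, it may help to report them all at once. The following is a minimal sketch using only the standard library (Python 3.8 or later). The list of distribution names is an assumption based on the packages mentioned in this thread, not something PhageBoost provides.

```python
import sys
from importlib import metadata

# Distribution names as they would appear on PyPI; this list is an assumption
# drawn from the packages discussed in this thread.
PACKAGES = ("PhageBoost", "biopython", "xgboost", "pyrodigal")


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, version in installed_versions().items():
        print(name, version or "not installed")
```

Pasting this output into a bug report shows the Python, Biopython, xgboost and pyrodigal versions together, which is the information requested here and later in the thread.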

WuJiaWei121 commented 3 years ago

I cannot reproduce this error, but it seems something is happening inside Biopython. Can you show us which Biopython version you are using?

import Bio
print(Bio.__version__)

I have run into the same problem. My Biopython version is 1.79. How can I solve it?

lowel0 commented 3 years ago

Hi,

I cannot reproduce this error, but it seems something is happening inside Biopython. Can you show us which Biopython version you are using?

import Bio
print(Bio.__version__)

Thank you for your response.

My Biopython version is also 1.79.

Sincere regards,

...Lowel.

tsp-kucbd commented 3 years ago

I reinstalled it on a new machine and could reproduce the error. It seems pyrodigal has changed its internals. You can use the GitHub version for now, until we create a new pip package.
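For readers who want to see why an 'X' residue triggers the error: the traceback earlier in the thread shows Biopython's flexibility() looking up each residue in a table and failing because the table has no entry for 'X'. The sketch below is not the fix that went into the GitHub version. It is a hypothetical pre-check, assuming that only the 20 standard amino acid letters are safe to pass on. The function names are invented for illustration.

```python
# The 20 standard amino acid one-letter codes. Treating everything else as
# unsupported is an assumption made for this sketch.
STANDARD_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")


def nonstandard_residues(seq):
    """Return the sorted characters in seq that are not standard amino acids."""
    return sorted(set(seq.upper()) - STANDARD_AA)


def strip_nonstandard(seq):
    """Drop every character that is not one of the 20 standard amino acids."""
    return "".join(ch for ch in seq.upper() if ch in STANDARD_AA)


protein = "MKTXAYIAK*"  # made-up sequence with an ambiguous residue and a stop
print(nonstandard_residues(protein))  # ['*', 'X']
print(strip_nonstandard(protein))     # MKTAYIAK
```

Note that dropping residues changes the sequence and therefore any features computed from it, so this is best used to diagnose which gene calls contain ambiguous residues rather than as a silent workaround.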

lowel0 commented 3 years ago

Hi,

Thank you for the quick investigation. I have used the "github" installation with xgboost==1.1.1 and can now successfully process the test data.

Sincere regards,

...Lowel.

WuJiaWei121 commented 3 years ago

Hi, I don't understand what you mean by "I have used the "github" installation with xgboost==1.1.1". Could you explain it?

lowel0 commented 3 years ago

Hi, as part of the install process you use pip to install xgboost. By default, a working version was not being installed. I corrected this by using the command:

pip install xgboost==1.1.1

Sincere regards, ...Lowel.

WuJiaWei121 commented 3 years ago

Hi, thank you for your response. I have installed xgboost==1.1.1 with pip successfully, but I still get the same error: KeyError: 'X' when processing test data

What's more, I tried replacing the calc_features.py file, but that doesn't seem to work either. Do you have any other suggestions?

lowel0 commented 3 years ago

Hi,

I managed to get rid of the error by using Python 3.7.0 and the development version from GitHub rather than the released version:

git clone https://github.com/ku-cbd/PhageBoost.git
cd PhageBoost
python setup.py bdist_wheel
pip install . xgboost==1.1.1

The Python version is important.

Sincere regards,

...Lowel.
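Because the working setup in this comment depends on the interpreter version, a quick check before running PhageBoost can save a failed run. This is a minimal sketch. The 3.7 value comes from the comment above and is not an official PhageBoost requirement, and the helper name is made up for illustration.

```python
import sys

# Python version reported to work in this thread; not an official requirement.
REPORTED_PYTHON = (3, 7)


def same_minor_version(current, reported=REPORTED_PYTHON):
    """True when the (major, minor) parts of two version tuples agree."""
    return tuple(current[:2]) == tuple(reported)


if not same_minor_version(sys.version_info):
    print(
        "Python %d.%d differs from the 3.7 setup reported to work in this thread."
        % sys.version_info[:2]
    )
```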
