Open Darcy220606 opened 2 years ago
Thank you for considering this application. To solve this issue, please check the utils.py script and make sure that the create_dataset function ends with
return list(dataset["sequence"]), list(dataset["label"])
and NOT
return list(dataset["sequences"]), list(dataset["label"])
from typing import List, Tuple

import pandas as pd

def create_dataset(data_path: str) -> Tuple[List[str], List[int]]:
    dataset = pd.read_csv(data_path)
    dataset = dataset.sample(frac=1).reset_index(drop=True)  # shuffle the dataset
    return list(dataset["sequence"]), list(dataset["label"])
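Since the KeyError depends on which column names the CSV file actually contains, a header check before indexing can make the cause easier to spot. The following is only a minimal sketch, not the repository's code: load_checked_dataset and REQUIRED_COLUMNS are hypothetical names, and it assumes the data file is expected to have "sequence" and "label" columns as in the fix above.

```python
from typing import List, Tuple

import pandas as pd

# Assumed header names, taken from the corrected return statement above.
REQUIRED_COLUMNS = ("sequence", "label")


def load_checked_dataset(data_path: str) -> Tuple[List[str], List[int]]:
    """Hypothetical variant of create_dataset that validates the CSV header first."""
    dataset = pd.read_csv(data_path)
    # Report every missing column together with the columns that were found.
    missing = [col for col in REQUIRED_COLUMNS if col not in dataset.columns]
    if missing:
        raise ValueError(
            f"{data_path} is missing column(s) {missing}; "
            f"found {list(dataset.columns)}"
        )
    dataset = dataset.sample(frac=1).reset_index(drop=True)  # shuffle the rows
    return list(dataset["sequence"]), list(dataset["label"])
```

With a check like this, a file whose header says "sequences" would fail with a message listing the columns it does contain, instead of a bare KeyError from inside pandas.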
Thanks for your quick reply. I changed it accordingly, but I still get a similar error: KeyError: 'sequence'
Traceback (most recent call last):
  File "/conda_envs/amp-app/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'sequence'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/AMP-app/AMP.py", line 141, in
No worries, it was an issue related to the data. I just uploaded the correct data file, all_Data.csv.
Enjoy!!!
Hi @Hayfabm
Thanks a lot for following up on the error and updating the data and script files. Regarding the output, there doesn't seem to be a sequence identifier before the prediction results in the table 'Antimicrobial recognition.txt'. Is that the case, and if not, where can I find the identifiers?
Many thanks once again.
0.8765432098765432,0.8415841542495834,0.9113300447717732,0.7548495571138049,0.9423986733648735,0.9540515969171111
0.908641975308642,0.8571428529204785,0.9603960348495246,0.8217581174033954,0.9651758279276205,0.9714525215491914
0.9108910891089109,0.8613861343495736,0.9603960348495246,0.8258399856982515,0.9537545338692285,0.9617516423688823
0.9133663366336634,0.8910891044995589,0.9356435597245368,0.8275544702694583,0.9674296637584551,0.9682923557732053
0.8935643564356436,0.8663366293745711,0.9207920746495442,0.7882983889008571,0.9483874129987256,0.9572226613364542
0.9158415841584159,0.9059405895745516,0.9257425696745417,0.8318462754108389,0.9734094696598373,0.9760802374115075
0.8960396039603961,0.8514851442995786,0.9405940547495344,0.7952427724679769,0.957063033035977,0.9632577523953173
0.8861386138613861,0.8366336592245859,0.9356435597245368,0.7760905889694412,0.9629693167336535,0.966777086469228
0.9381188118811881,0.9158415796245466,0.9603960348495246,0.8771086301658331,0.9748799137339477,0.9784152592072608
0.8910891089108911,0.900990094549554,0.8811881144495638,0.7823316161601936,0.9611067542397804,0.9657823404985096
acc=90.30% (+/- 1.69%) sensitivity=87.28% (+/- 2.69%) specificity=93.32% (+/- 2.38%) mcc=80.81% (+/- 3.36%) roc_auc=96.07% (+/- 0.99%) roc_pr=96.63% (+/- 0.73%)
Hi Hayfa,
Thanks a lot for setting up this CLI version of the AMP scanner. I was trying to set it up locally, but I keep getting a KeyError: 'sequences'. It seems that this error is also linked to the utils.py script. I'm testing the tool using the training datasets you have available online and a custom test dataset, by running $ python AMP.py [input.fa]. Any idea how I can resolve this issue?
Many thanks for your help!!
Traceback (most recent call last):
  File "/conda_envs/amp-app/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'sequences'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/AMP-app/AMP.py", line 141, in
    sequences, labels=np.array(create_dataset(data_path=DATASET))
  File "/AMP-app/utils.py", line 11, in create_dataset
    return list(dataset["sequences"]), list(dataset["label"])
  File "/conda_envs/amp-app/lib/python3.7/site-packages/pandas/core/frame.py", line 3455, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/conda_envs/amp-app/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'sequences'