NicolasHug / Surprise

A Python scikit for building and analyzing recommender systems
http://surpriselib.com
BSD 3-Clause "New" or "Revised" License
6.28k stars 1k forks

Error: Sample larger than population or is negative? #457

Open oanaale95 opened 1 year ago

oanaale95 commented 1 year ago

Hi, apologies in advance if my attempts at solving this were not enough; I am taking my first ML course.

I am trying to use Surprise to create a recommender system. When calling the load_from_df method of Dataset, I get ValueError: Sample larger than population or is negative.
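The message text matches the one Python's standard-library `random.sample` raises when it is asked for more items than the population holds (or for a negative count). Below is a minimal sketch of that behaviour. Whether such a call is what actually fails underneath the frames in the traceback is an assumption, and the helper name `try_sample` is made up for illustration:

```python
import random


def try_sample(population, k):
    """Return k random items from population, or the error text if sampling fails."""
    try:
        return random.sample(population, k)
    except ValueError as exc:
        # random.sample raises ValueError when k > len(population) or k < 0.
        return str(exc)


# Asking for 2 items out of a population of 1 reproduces the message.
print(try_sample([1], 2))  # → Sample larger than population or is negative
```

If this is indeed the source, the error would say more about how many items something in the stack tried to pick from a small pool than about the ratings data itself.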

I was not able to find this error in the repo or online. What can I do to fix it?
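One thing that may be worth trying: the traceback below passes through `modin\pandas`, which suggests the object handed to `Dataset.load_from_df` is a Modin DataFrame rather than a plain pandas one. Here is a hedged sketch of a possible workaround. It assumes that converting to pandas first, via the `_to_pandas()` method that appears in the traceback, sidesteps the Dask scatter call. `to_plain_pandas` and `load_ratings` are hypothetical helper names, and this has not been verified against the setup in this report:

```python
def to_plain_pandas(df):
    """Return a plain pandas object if df looks like a Modin one.

    Assumption: Modin objects expose a `_to_pandas()` method (it shows up in
    the traceback); anything without that method is returned unchanged.
    """
    convert = getattr(df, "_to_pandas", None)
    return convert() if callable(convert) else df


def load_ratings(df, rating_scale=(1, 4)):
    """Build a Surprise dataset from the three rating columns of df."""
    # Imported lazily so to_plain_pandas can be used without Surprise installed.
    from surprise import Dataset, Reader

    reader = Reader(rating_scale=rating_scale)
    columns = ["session", "article_id", "rating"]
    # Convert before Surprise iterates over the rows with itertuples().
    return Dataset.load_from_df(to_plain_pandas(df[columns]), reader)
```

With the names from the traceback, the call would look like `data = load_ratings(train_data_groupped_by_event)`.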

{ "name": "ValueError", "message": "Sample larger than population or is negative", "stack": "---------------------------------------------------------------------------\nValueError Traceback (most recent call last)\nCell In [29], line 7\n 4 reader = Reader(rating_scale=(1, 4))\n 6 # Loads Pandas dataframe\n----> 7 data = Dataset.load_from_df(train_data_groupped_by_event[[\"session\", \"article_id\", \"rating\"]], reader)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\surprise\dataset.py:167, in Dataset.load_from_df(cls, df, reader)\n 150 @classmethod\n 151 def load_from_df(cls, df, reader):\n 152 \"\"\"Load a dataset from a pandas dataframe.\n 153 \n 154 Use this if you want to use a custom dataset that is stored in a pandas\n (...)\n 164 specified.\n 165 \"\"\"\n--> 167 return DatasetAutoFolds(reader=reader, df=df)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\surprise\dataset.py:262, in DatasetAutoFolds.__init__(self, ratings_file, reader, df)\n 260 elif df is not None:\n 261 self.df = df\n--> 262 self.raw_ratings = [\n 263 (uid, iid, float(r), None)\n 264 for (uid, iid, r) in self.df.itertuples(index=False)\n 265 ]\n 266 else:\n 267 raise ValueError(\"Must specify ratings file or dataframe.\")\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\surprise\dataset.py:262, in <listcomp>(.0)\n 260 elif df is not None:\n 261 self.df = df\n--> 262 self.raw_ratings = [\n 263 (uid, iid, float(r), None)\n 264 for (uid, iid, r) in self.df.itertuples(index=False)\n 265 ]\n 266 else:\n 267 raise ValueError(\"Must specify ratings file or dataframe.\")\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\pandas\dataframe.py:1286, in DataFrame.itertuples(self, index, name)\n 1283 return next(s._to_pandas().to_frame().T.itertuples(index=index, name=name))\n 1285 partition_iterator = PartitionIterator(self, 0, itertuples_builder)\n-> 1286 for v in partition_iterator:\n 1287 yield v\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\pandas\iterator.py:70, in PartitionIterator.__next__(self)\n 61 \"\"\"\n 62 Implement iterator interface.\n 63 \n (...)\n 67 Incremented iterator object.\n 68 \"\"\"\n 69 key = next(self.index_iter)\n---> 70 df = self.df.iloc[key]\n 71 return self.func(df)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\logging\logger_decorator.py:128, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)\n 113 \"\"\"\n 114 Compute function with logging if Modin logging is enabled.\n 115 \n (...)\n 125 Any\n 126 \"\"\"\n 127 if LogMode.get() == \"disable\":\n--> 128 return obj(*args, **kwargs)\n 130 logger = get_logger()\n 131 logger_level = getattr(logger, log_level)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\pandas\indexing.py:1067, in _iLocIndexer.__getitem__(self, key)\n 1063 return self._handle_boolean_masking(row_loc, col_loc)\n 1065 row_lookup, col_lookup = self._compute_lookup(row_loc, col_loc)\n-> 1067 result = self._getitem_positional(\n 1068 row_lookup,\n 1069 col_lookup,\n 1070 row_multiindex_full_lookup=False,\n 1071 col_multiindex_full_lookup=False,\n 1072 row_scalar=row_scalar,\n 1073 col_scalar=col_scalar,\n 1074 ndim=ndim,\n 1075 )\n 1077 if isinstance(result, Series):\n 1078 result._parent = self.df\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\logging\logger_decorator.py:128, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)\n 113 \"\"\"\n 114 Compute function with logging if Modin logging is enabled.\n 115 \n (...)\n 125 Any\n 126 \"\"\"\n 127 if LogMode.get() == \"disable\":\n--> 128 return obj(*args, **kwargs)\n 130 logger = get_logger()\n 131 logger_level = getattr(logger, log_level)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\pandas\indexing.py:404, in _LocationIndexerBase._getitem_positional(self, row_lookup, col_lookup, row_multiindex_full_lookup, col_multiindex_full_lookup, row_scalar, col_scalar, ndim)\n 394 axis = (\n 395 None\n 396 if (col_scalar and row_scalar)\n (...)\n 400 else 0\n 401 )\n 403 res_df = self.df.__constructor__(query_compiler=qc_view)\n--> 404 return res_df.squeeze(axis=axis)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\logging\logger_decorator.py:128, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)\n 113 \"\"\"\n 114 Compute function with logging if Modin logging is enabled.\n 115 \n (...)\n 125 Any\n 126 \"\"\"\n 127 if LogMode.get() == \"disable\":\n--> 128 return obj(*args, **kwargs)\n 130 logger = get_logger()\n 131 logger_level = getattr(logger, log_level)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\pandas\dataframe.py:2037, in DataFrame.squeeze(self, axis)\n 2035 return Series(query_compiler=self._query_compiler)\n 2036 if axis == 0 and len(self.index) == 1:\n-> 2037 return Series(query_compiler=self.T._query_compiler)\n 2038 else:\n 2039 return self.copy()\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\pandas\base.py:3429, in BasePandasDataset.__getattribute__(self, item)\n 3415 @disable_logging\n 3416 def __getattribute__(self, item):\n 3417 \"\"\"\n 3418 Return item from the BasePandasDataset.\n 3419 \n (...)\n 3427 Any\n 3428 \"\"\"\n-> 3429 attr = super().__getattribute__(item)\n 3430 if item not in _DEFAULT_BEHAVIOUR and not self._query_compiler.lazy_execution:\n 3431 # We default to pandas on empty DataFrames. This avoids a large amount of\n 3432 # pain in underlying implementation and returns a result immediately rather\n 3433 # than dealing with the edge cases that empty DataFrames have.\n 3434 if callable(attr) and self.empty and hasattr(self._pandas_class, item):\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\pandas\dataframe.py:538, in DataFrame.transpose(self, copy, *args)\n 533 \"\"\"\n 534 Transpose index and columns.\n 535 \"\"\"\n 536 # FIXME: Judging by pandas docs *args serves only compatibility purpose\n 537 # and does not affect the result, we shouldn't pass it to the query compiler.\n--> 538 return DataFrame(query_compiler=self._query_compiler.transpose(*args))\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\logging\logger_decorator.py:128, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)\n 113 \"\"\"\n 114 Compute function with logging if Modin logging is enabled.\n 115 \n (...)\n 125 Any\n 126 \"\"\"\n 127 if LogMode.get() == \"disable\":\n--> 128 return obj(*args, **kwargs)\n 130 logger = get_logger()\n 131 logger_level = getattr(logger, log_level)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\core\storage_formats\pandas\query_compiler.py:704, in PandasQueryCompiler.transpose(self, *args, **kwargs)\n 702 def transpose(self, *args, **kwargs):\n 703 # Switch the index and columns and transpose the data within the blocks.\n--> 704 return self.__constructor__(self._modin_frame.transpose())\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\logging\logger_decorator.py:128, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)\n 113 \"\"\"\n 114 Compute function with logging if Modin logging is enabled.\n 115 \n (...)\n 125 Any\n 126 \"\"\"\n 127 if LogMode.get() == \"disable\":\n--> 128 return obj(*args, **kwargs)\n 130 logger = get_logger()\n 131 logger_level = getattr(logger, log_level)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\core\dataframe\pandas\dataframe\dataframe.py:125, in lazy_metadata_decorator.<locals>.decorator.<locals>.run_f_on_minimally_updated_metadata(self, *args, **kwargs)\n 123 elif apply_axis == \"rows\":\n 124 obj._propagate_index_objs(axis=0)\n--> 125 result = f(self, *args, **kwargs)\n 126 if apply_axis is None and not transpose:\n 127 result._deferred_index = self._deferred_index\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\core\dataframe\pandas\dataframe\dataframe.py:3131, in PandasDataframe.transpose(self)\n 3118 @lazy_metadata_decorator(apply_axis=None, transpose=True)\n 3119 def transpose(self):\n 3120 \"\"\"\n 3121 Transpose the index and columns of this Modin DataFrame.\n 3122 \n (...)\n 3129 New Modin DataFrame.\n 3130 \"\"\"\n-> 3131 new_partitions = self._partition_mgr_cls.lazy_map_partitions(\n 3132 self._partitions, lambda df: df.T\n 3133 ).T\n 3134 if self._dtypes is not None:\n 3135 new_dtypes = pandas.Series(\n 3136 np.full(len(self.index), find_common_type(self.dtypes.values)),\n 3137 index=self.index,\n 3138 )\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\logging\logger_decorator.py:128, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)\n 113 \"\"\"\n 114 Compute function with logging if Modin logging is enabled.\n 115 \n (...)\n 125 Any\n 126 \"\"\"\n 127 if LogMode.get() == \"disable\":\n--> 128 return obj(*args, **kwargs)\n 130 logger = get_logger()\n 131 logger_level = getattr(logger, log_level)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\core\dataframe\pandas\partitioning\partition_manager.py:58, in wait_computations_if_benchmark_mode.<locals>.wait(cls, *args, **kwargs)\n 55 @wraps(func)\n 56 def wait(cls, *args, **kwargs):\n 57 \"\"\"Wait for computation results.\"\"\"\n---> 58 result = func(cls, *args, **kwargs)\n 59 if BenchmarkMode.get():\n 60 if isinstance(result, tuple):\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\core\dataframe\pandas\partitioning\partition_manager.py:521, in PandasDataframePartitionManager.lazy_map_partitions(cls, partitions, map_func)\n 503 @classmethod\n 504 @wait_computations_if_benchmark_mode\n 505 def lazy_map_partitions(cls, partitions, map_func):\n 506 \"\"\"\n 507 Apply map_func to every partition in partitions lazily.\n 508 \n (...)\n 519 An array of partitions\n 520 \"\"\"\n--> 521 preprocessed_map_func = cls.preprocess_func(map_func)\n 522 return np.array(\n 523 [\n 524 [part.add_to_apply_calls(preprocessed_map_func) for part in row]\n 525 for row in partitions\n 526 ]\n 527 )\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\logging\logger_decorator.py:128, in enable_logging.<locals>.decorator.<locals>.run_and_log(*args, **kwargs)\n 113 \"\"\"\n 114 Compute function with logging if Modin logging is enabled.\n 115 \n (...)\n 125 Any\n 126 \"\"\"\n 127 if LogMode.get() == \"disable\":\n--> 128 return obj(*args, **kwargs)\n 130 logger = get_logger()\n 131 logger_level = getattr(logger, log_level)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\core\dataframe\pandas\partitioning\partition_manager.py:120, in PandasDataframePartitionManager.preprocess_func(cls, map_func)\n 93 @classmethod\n 94 def preprocess_func(cls, map_func):\n 95 \"\"\"\n 96 Preprocess a function to be applied to PandasDataframePartition objects.\n 97 \n (...)\n 118 you are using does not require any modification to a given function.\n 119 \"\"\"\n--> 120 return cls._partition_class.preprocess_func(map_func)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\core\execution\dask\implementations\pandas_on_dask\partitioning\partition.py:257, in PandasOnDaskDataframePartition.preprocess_func(cls, func)\n 242 @classmethod\n 243 def preprocess_func(cls, func):\n 244 \"\"\"\n 245 Preprocess a function before an apply call.\n 246 \n (...)\n 255 An object that can be accepted by apply.\n 256 \"\"\"\n--> 257 return DaskWrapper.put(func, hash=False, broadcast=True)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\modin\core\execution\dask\common\engine_wrapper.py:98, in DaskWrapper.put(cls, data, **kwargs)\n 83 \"\"\"\n 84 Put data into distributed memory.\n 85 \n (...)\n 95 List, dict, iterator, or queue of futures matching the type of input.\n 96 \"\"\"\n 97 client = default_client()\n---> 98 return client.scatter(data, **kwargs)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\client.py:2506, in Client.scatter(self, data, workers, broadcast, direct, hash, timeout, asynchronous)\n 2504 else:\n 2505 local_worker = None\n-> 2506 return self.sync(\n 2507 self._scatter,\n 2508 data,\n 2509 workers=workers,\n 2510 broadcast=broadcast,\n 2511 direct=direct,\n 2512 local_worker=local_worker,\n 2513 timeout=timeout,\n 2514 asynchronous=asynchronous,\n 2515 hash=hash,\n 2516 )\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\utils.py:339, in SyncMethodMixin.sync(self, func, asynchronous, callback_timeout, *args, **kwargs)\n 337 return future\n 338 else:\n--> 339 return sync(\n 340 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs\n 341 )\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\utils.py:406, in sync(loop, func, callback_timeout, *args, **kwargs)\n 404 if error:\n 405 typ, exc, tb = error\n--> 406 raise exc.with_traceback(tb)\n 407 else:\n 408 return result\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\utils.py:379, in sync.<locals>.f()\n 377 future = asyncio.wait_for(future, callback_timeout)\n 378 future = asyncio.ensure_future(future)\n--> 379 result = yield future\n 380 except Exception:\n 381 error = sys.exc_info()\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\tornado\gen.py:762, in Runner.run(self)\n 759 exc_info = None\n 761 try:\n--> 762 value = future.result()\n 763 except Exception:\n 764 exc_info = sys.exc_info()\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\client.py:2386, in Client._scatter(self, data, workers, broadcast, direct, local_worker, timeout, hash)\n 2382 await self.scheduler.update_data(\n 2383 who_has=who_has, nbytes=nbytes, client=self.id\n 2384 )\n 2385 else:\n-> 2386 await self.scheduler.scatter(\n 2387 data=data2,\n 2388 workers=workers,\n 2389 client=self.id,\n 2390 broadcast=broadcast,\n 2391 timeout=timeout,\n 2392 )\n 2394 out = {k: Future(k, self, inform=False) for k in data}\n 2395 for key, typ in types.items():\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\core.py:1163, in PooledRPCCall.__getattr__.<locals>.send_recv_from_rpc(**kwargs)\n 1161 prev_name, comm.name = comm.name, \"ConnectionPool.\" + key\n 1162 try:\n-> 1163 return await send_recv(comm=comm, op=key, **kwargs)\n 1164 finally:\n 1165 self.pool.reuse(self.addr, comm)\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\core.py:953, in send_recv(comm, reply, serializers, deserializers, **kwargs)\n 951 _, exc, tb = clean_exception(**response)\n 952 assert exc\n--> 953 raise exc.with_traceback(tb)\n 954 else:\n 955 raise Exception(response[\"exception_text\"])\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\core.py:771, in _handle_comm()\n 769 result = handler(**msg)\n 770 if inspect.iscoroutine(result):\n--> 771 result = await result\n 772 elif inspect.isawaitable(result):\n 773 raise RuntimeError(\n 774 f\"Comm handler returned unknown awaitable. Expected coroutine, instead got {type(result)}\"\n 775 )\n\n
File c:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\scheduler.py:5707, in scatter()\n 5705 if broadcast:\n 5706 n = len(nthreads) if broadcast is True else broadcast\n-> 5707 await self.replicate(keys=keys, workers=workers, n=n)\n 5709 self.log_event(\n 5710 [client, \"all\"], {\"action\": \"scatter\", \"client\": client, \"count\": len(data)}\n 5711 )\n 5712 
\u001b[39mreturn\u001b[39;00m keys\n\nFile \u001b[1;32mc:\Users\oanaa\anaconda3\envs\CodeBase\lib\site-packages\distributed\scheduler.py:6516\u001b[0m, in \u001b[0;36mreplicate\u001b[1;34m()\u001b[0m\n\u001b[0;32m 6513\u001b[0m count \u001b[39m=\u001b[39m \u001b[39mmin\u001b[39m(n_missing, branching_factor \u001b[39m\u001b[39m \u001b[39mlen\u001b[39m(ts\u001b[39m.\u001b[39mwho_has))\n\u001b[0;32m 6514\u001b[0m \u001b[39massert\u001b[39;00m count \u001b[39m>\u001b[39m \u001b[39m0\u001b[39m\n\u001b[1;32m-> 6516\u001b[0m \u001b[39mfor\u001b[39;00m ws \u001b[39min\u001b[39;00m random\u001b[39m.\u001b[39msample(\u001b[39mtuple\u001b[39m(workers \u001b[39m-\u001b[39m ts\u001b[39m.\u001b[39mwho_has), count):\n\u001b[0;32m 6517\u001b[0m gathers[ws\u001b[39m.\u001b[39maddress][ts\u001b[39m.\u001b[39mkey] \u001b[39m=\u001b[39m [\n\u001b[0;32m 6518\u001b[0m wws\u001b[39m.\u001b[39maddress \u001b[39mfor\u001b[39;00m wws \u001b[39min\u001b[39;00m ts\u001b[39m.\u001b[39mwho_has\n\u001b[0;32m 6519\u001b[0m ]\n\u001b[0;32m 6521\u001b[0m \u001b[39mawait\u001b[39;00m asyncio\u001b[39m.\u001b[39mgather(\n\u001b[0;32m 6522\u001b[0m \u001b[39m\u001b[39m(\n\u001b[0;32m 6523\u001b[0m \u001b[39m# Note: this never raises exceptions\u001b[39;00m\n\u001b[1;32m (...)\u001b[0m\n\u001b[0;32m 6526\u001b[0m )\n\u001b[0;32m 6527\u001b[0m )\n\nFile \u001b[1;32mc:\Users\oanaa\anaconda3\envs\CodeBase\lib\random.py:482\u001b[0m, in \u001b[0;36msample\u001b[1;34m()\u001b[0m\n\u001b[0;32m 480\u001b[0m randbelow \u001b[39m=\u001b[39m \u001b[39mself\u001b[39m\u001b[39m.\u001b[39m_randbelow\n\u001b[0;32m 481\u001b[0m \u001b[39mif\u001b[39;00m \u001b[39mnot\u001b[39;00m \u001b[39m0\u001b[39m \u001b[39m<\u001b[39m\u001b[39m=\u001b[39m k \u001b[39m<\u001b[39m\u001b[39m=\u001b[39m n:\n\u001b[1;32m--> 482\u001b[0m \u001b[39mraise\u001b[39;00m \u001b[39mValueError\u001b[39;00m(\u001b[39m\"\u001b[39m\u001b[39mSample larger than population or is negative\u001b[39m\u001b[39m\"\u001b[39m)\n\u001b[0;32m 483\u001b[0m 
result \u001b[39m=\u001b[39m [\u001b[39mNone\u001b[39;00m] \u001b[39m*\u001b[39m k\n\u001b[0;32m 484\u001b[0m setsize \u001b[39m=\u001b[39m \u001b[39m21\u001b[39m \u001b[39m# size of a small set minus size of an empty list\u001b[39;00m\n\n\u001b[1;31mValueError\u001b[0m: Sample larger than population or is negative" }
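
For reference, the last frame of the traceback is the standard library's `random.sample`, called from `replicate()` in `distributed/scheduler.py` with `tuple(workers - ts.who_has)` as the population and `count` as the sample size. The check that fails there can be triggered in isolation. The snippet below is only a minimal sketch of that check: the names `try_sample`, `candidates` and `count` are hypothetical stand-ins, and the values are an assumption about one way the condition could arise, not what the scheduler actually held.

```python
import random


def try_sample(population, k):
    """Return a sample of size k, or the error message if random.sample rejects k."""
    try:
        return random.sample(population, k)
    except ValueError as exc:
        return str(exc)


# Hypothetical stand-ins for the arguments seen in the replicate() frame:
# no candidate workers left to copy to, but one copy still requested.
candidates = ()
count = 1

# k is larger than the population, so random.sample raises ValueError
# and try_sample returns the message instead of a list.
outcome = try_sample(candidates, count)
print(outcome)

# With enough candidates the same call succeeds and returns a list.
print(try_sample(("worker-a", "worker-b", "worker-c"), 2))
```

If this is the same condition, it would suggest the error originates in the dask `distributed` scatter step rather than in the rating data itself, but that cannot be confirmed from the traceback alone.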

NicolasHug commented 1 year ago

Hi @oanaale95,

It's going to be impossible for me to help without looking at some code. It looks like you copy/pasted the content of a notebook, but this doesn't render properly here.

Could you please provide a reproducible code example?