Open Salonijain27 opened 4 years ago
I compared the forests built by the Dask RF with those built by the non-Dask RF implementation. When the non-Dask forests were built with the same seed values used by the Dask RF workers, the resulting forests were almost identical.
This issue could be due to seed sensitivity: changing the seed value for the forest makes the RF model accuracy vary a lot. Since each worker has a different seed, the forest created in each worker is different, which could be affecting the overall accuracy of the Dask RF model.
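For anyone who wants to reproduce the kind of comparison described above without a GPU, here is a rough, pure-Python sketch. It is not cuML code: the dataset, the one-split "trees", and every function name are hypothetical stand-ins, and the worker seeds are made up. It only illustrates two things: the same seed reproduces the same forest, and different seeds give different forests whose accuracy can differ.

```python
import random

def make_data(n, seed):
    """Synthetic two-feature binary dataset (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        rows.append((x, 1 if x[0] + 0.3 * x[1] > 0.6 else 0))
    return rows

def fit_stump(sample, rng):
    """Fit a one-split tree: random feature, best of a few random thresholds."""
    feature = rng.randrange(2)
    best = None
    for _ in range(8):
        thr = rng.random()
        # Errors made when predicting class 1 for x[feature] > thr.
        err = sum((x[feature] > thr) != bool(y) for x, y in sample)
        flip = err > len(sample) / 2      # invert the rule if it is worse than chance
        err = min(err, len(sample) - err)
        if best is None or err < best[0]:
            best = (err, feature, thr, flip)
    return best[1:]

def fit_forest(data, n_trees, seed):
    """Build n_trees stumps on bootstrap samples; all randomness comes from seed."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]  # bootstrap sample
        forest.append(fit_stump(sample, rng))
    return forest

def predict(forest, x):
    """Majority vote over all stumps."""
    votes = sum((x[f] > t) != flip for f, t, flip in forest)
    return 1 if 2 * votes > len(forest) else 0

def accuracy(forest, data):
    return sum(predict(forest, x) == y for x, y in data) / len(data)

train, test = make_data(200, seed=1), make_data(200, seed=2)

# Same seed: the two forests are identical.
same_seed_match = fit_forest(train, 5, seed=42) == fit_forest(train, 5, seed=42)

# Different seeds: compare accuracy across a handful of seeds.
acc_by_seed = {s: accuracy(fit_forest(train, 5, seed=s), test) for s in range(5)}

# Dask-like setup (simplified): each "worker" builds part of the forest with
# its own seed, and the parts are concatenated into one model.
worker_seeds = [10, 11]  # assumed values, not how cuML derives seeds
combined = [tree for s in worker_seeds for tree in fit_forest(train, 5, seed=s)]
combined_acc = accuracy(combined, test)
```

Printing `acc_by_seed` next to `combined_acc` gives a quick feel for the spread. In the real Dask RF each worker also trains on its own data partition, which this sketch leaves out.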
This test example was taken from test/dask/test_random_forest.py and modified to scale the number of samples and the number of estimators with the number of GPUs.
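As a rough illustration of what scaling with the GPU count could look like, here is a small sketch. The helper name, the base values, and the linear scaling rule are all assumptions for illustration; they are not taken from the test file.

```python
def scaled_params(n_gpus, base_samples=10_000, base_estimators=10):
    """Scale dataset size and forest size linearly with the GPU count.

    base_samples and base_estimators are hypothetical per-GPU values.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return {
        "n_samples": base_samples * n_gpus,
        "n_estimators": base_estimators * n_gpus,
    }

# Parameters for runs on 1, 2, 4 and 8 GPUs.
params_by_gpu_count = {n: scaled_params(n) for n in (1, 2, 4, 8)}
```

With linear scaling, each worker sees roughly the same number of rows and builds the same number of trees regardless of the GPU count, so accuracy differences between runs are easier to attribute to the distributed setup itself.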
Looking into it further