vickyyin1493 / Replication-FFJR

Problem 6: code is taking a long time to run (with no results yet) #12

Open vickyyin1493 opened 1 year ago

vickyyin1493 commented 1 year ago

After our meeting last week, we changed the code to use dsf and dsi instead of msf and msi for our replication (the daily rather than monthly files). Because we are now working with significantly more data, the code is taking a long time to run: as of 12 Jun, 12:32 PM, we still have no results. We suspect that part of the reason is that we call collect() on a lot of data instead of completing the processing on the remote server. However, if we do not collect the data, we cannot join it with the local data frames we collected earlier. One approach we plan to try (after waiting perhaps another 30 minutes) is to remove collect() for all variables, keep everything on the remote server, and run the code again.
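One way around the join problem is to go in the opposite direction: upload the small local table to the server, do the join and aggregation there, and fetch only the final result. The sketch below illustrates that pattern under stated assumptions. It is not our actual code: an in-memory SQLite database stands in for the remote server, the `events` table and the `join_on_server` helper are hypothetical, and the `dsf` columns (`permno`, `date`, `ret`) and values are made up for illustration. Whether the real server allows temporary tables is also an assumption. If the code is R with dplyr/dbplyr, which the use of collect() suggests, the analogous step would be copying the local data frame to the server (for example with `copy_to()`) and calling collect() only once at the end.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the remote server.
remote = sqlite3.connect(":memory:")

# Toy stand-in for the large daily table; real data would already be on the server.
remote.execute("CREATE TABLE dsf (permno INTEGER, date TEXT, ret REAL)")
remote.executemany(
    "INSERT INTO dsf VALUES (?, ?, ?)",
    [
        (10001, "2020-01-02", 0.01),
        (10001, "2020-01-03", -0.02),
        (10002, "2020-01-02", 0.03),
        (10003, "2020-01-02", 0.05),
    ],
)

# Hypothetical small local table, e.g. firms identified in an earlier step.
local_events = [(10001, "2020-01-03"), (10002, "2020-01-02")]


def join_on_server(conn, events):
    """Upload the small local table, join on the server, fetch only the summary."""
    # The small table travels to the server instead of the large one travelling to us.
    conn.execute("CREATE TEMP TABLE events (permno INTEGER, event_date TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", events)

    # The join and aggregation both run inside the database.
    query = """
        SELECT d.permno, COUNT(*) AS n_days, AVG(d.ret) AS mean_ret
        FROM dsf AS d
        INNER JOIN events AS e ON d.permno = e.permno
        GROUP BY d.permno
        ORDER BY d.permno
    """
    # Only the aggregated rows are pulled back (the single "collect" step).
    return conn.execute(query).fetchall()


result = join_on_server(remote, local_events)
print(result)
```

Only one row per matched firm comes back, so the amount of data transferred depends on the size of the summary rather than on the size of the daily file.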