GullyBurns closed this issue 2 years ago
Hi @GullyBurns!
I am not really familiar with Databricks. Is there a way for me to test it?
It's weird, because the model runs smoothly both locally and on Google Colab.
Hey Federico,
Our implementation is really firewalled so there's no way for you to tinker locally. Is there any way to set logging parameters in the source code to get some diagnostics?
I can ask some of our local Databricks experts to take a look and see what might be going on.
But the basic demo on the dbpedia data should build in a few minutes, right?
Gul
There's no logging implemented yet, but I can probably work on this and add some diagnostics.
Yes, it should only take a few minutes to complete. The very same demo runs in a few minutes on Google Colab, which is why I cannot really understand what's happening.
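Until the library has logging, one possible stopgap is Python's standard `faulthandler` module, which can dump the stack of every Python thread once a call has run longer than a timeout. The sketch below is only an illustration: `run_with_hang_dump`, `train_fn`, and `log_path` are hypothetical names, not part of the library, and the dump covers only the main process, not any DataLoader worker processes.

```python
import faulthandler


def run_with_hang_dump(train_fn, log_path, timeout_s=600):
    """Run train_fn(); while it is still running, write the stack of every
    Python thread to log_path every timeout_s seconds."""
    with open(log_path, "w") as log:
        # faulthandler needs a real file descriptor, so write to a file
        # rather than a notebook's redirected stderr.
        faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log)
        try:
            return train_fn()
        finally:
            # Stop the watchdog before the log file is closed.
            faulthandler.cancel_dump_traceback_later()
```

For example, `run_with_hang_dump(lambda: ctm.fit(training_dataset), "/tmp/ctm_hang.log")` would leave a traceback in the log file showing where the main process is waiting if an epoch stalls for more than ten minutes.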
Hello.
I had the same problem on Databricks: the model stopped working at random after some epochs. What worked for me was setting the number of workers on the DataLoader to 0.
Tuulia
We just got confirmation from the Databricks folks about this as a solution.
ctm = CombinedTM(bow_size=len(tp.vocab), contextual_size=768, n_components=50, num_epochs=20, num_data_loader_workers=0)
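If the same notebook has to run both on Databricks and elsewhere, the worker count could be chosen at runtime instead of hard-coded. This is a minimal sketch under two assumptions: that the cluster sets the `DATABRICKS_RUNTIME_VERSION` environment variable, and that falling back to the CPU count is acceptable off Databricks. The helper name `data_loader_workers` is hypothetical.

```python
import os


def data_loader_workers(env=None):
    """Return 0 on Databricks (assumed to be detectable through the
    DATABRICKS_RUNTIME_VERSION variable), otherwise the CPU count."""
    env = os.environ if env is None else env
    if "DATABRICKS_RUNTIME_VERSION" in env:
        return 0  # load batches in the main process, no worker processes
    return os.cpu_count() or 1


# Same settings as the call above, with the worker count picked at runtime.
ctm_kwargs = dict(
    contextual_size=768,
    n_components=50,
    num_epochs=20,
    num_data_loader_workers=data_loader_workers(),
)
# ctm = CombinedTM(bow_size=len(tp.vocab), **ctm_kwargs)
```

With zero workers the batches are loaded in the main process, which is slower in principle but avoids the worker subprocesses that appear to be involved in the hang.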
Description
I am trying to run the basic CTM demo for the combined topic model from this Colab notebook: https://colab.research.google.com/drive/1fXJjr_rwqvpp1IdNQ4dxqN4Dp88cxO97?usp=sharing#scrollTo=stAb2Q4eBB3W
What I Did
This is the screenshot (from Databricks) that has not changed for ~30 minutes.
Running this on the standard dbpedia set just hangs at one particular epoch: https://raw.githubusercontent.com/vinid/data/master/dbpedia_sample_abstract_20k_unprep.txt
I expected later epochs to take roughly the same amount of time as earlier ones, so when one epoch takes this much longer it looks like a block rather than the code simply being slow.