lingdb / Sound-Comparisons

Exploring phonetic diversity across language families —
http://www.soundcomparisons.com

Sound Comparisons server is incredibly slow for uploads #438

Open Linguista opened 7 years ago

Linguista commented 7 years ago

When I upload sound files to the Sound Comparisons server, using SSH and the MPI VPN, I get an average upload speed of 30 KB/s. This is incredibly slow, reminiscent of the speeds I got in the 1990s, and it's making it very hard to work (each language variety requires uploading 600-700 files, which can take an hour or more).

My internet connection has 5 Mbit/s of upload bandwidth (roughly 625 KB/s), so that isn't the problem. My computer is very fast, so the VPN/SSH overhead should be negligible.

So I can only conclude that the problem is with the server itself. And needless to say, as language varieties are finished, server traffic will only increase, so this problem can only be expected to get worse.
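To put the figures above side by side, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes decimal units (1 KB = 1000 bytes) and takes "an hour" as the duration of one batch, so the batch size it derives is an estimate and not a measured value.

```python
# Figures quoted in this thread; decimal units (1 KB = 1000 bytes) are an assumption.
UPLINK_MBIT_PER_S = 5     # advertised upload bandwidth of the home connection
OBSERVED_KB_PER_S = 30    # average throughput seen over SSH + VPN
BATCH_SECONDS = 3600      # "an hour or more" per language variety


def mbit_to_kb_per_s(mbit_per_s):
    """Convert megabits per second to kilobytes per second (decimal units)."""
    return mbit_per_s * 1_000_000 / 8 / 1000


uplink_kb_per_s = mbit_to_kb_per_s(UPLINK_MBIT_PER_S)

# Data moved in one hour at the observed rate (an estimate of one batch).
batch_kb = OBSERVED_KB_PER_S * BATCH_SECONDS

# Time the same batch would need if the uplink were fully used.
seconds_at_uplink = batch_kb / uplink_kb_per_s

# How far the observed rate falls short of the uplink.
slowdown = uplink_kb_per_s / OBSERVED_KB_PER_S

print(f"uplink: {uplink_kb_per_s:.0f} KB/s")
print(f"estimated batch size: {batch_kb / 1000:.0f} MB")
print(f"time at full uplink: {seconds_at_uplink / 60:.1f} min")
print(f"observed rate is about {slowdown:.0f}x below the uplink")
```

Under these assumptions a batch is on the order of 100 MB and would take a few minutes, not an hour, if the uplink were the only limit. This shows the size of the gap, not where along the path the time is lost.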

Bibiko commented 7 years ago

This issue is not specific to this particular server. I've been encountering this behaviour for years with several servers. At home I also have a really fast internet connection, but via VPN to Jena/Göttingen the upload speed is around 30 kB/s in 98% of all cases. In rare cases reconnecting helps, but not by much. I also have access to an Eduroam Wi-Fi network at my place, which I use for uploading because the upload speed there reaches up to 1 MB/s. This suggests that the connection speed is tied to some kind of internal infrastructure issue. I asked our IT people, but they have no answer.

PaulHeggarty commented 7 years ago

Isn't the problem, and therefore the title of this issue, a more general one about server speed in Göttingen? It's a horrible, debilitating problem with CoBL too: far and away the number 1 gripe from all our users is that the site is excruciatingly slow. When Göttingen broke the site, Jakob had to set up his own laptop as an emergency replacement server for a talk Cormac and I were giving on it in Zürich. Jakob's computer massively outran the original server in speed; it was a revelation.

So the broader issue is: we need a new server, not stymied by Göttingen. This is a critical requirement for the scientific tasks here: to stop wasting our collaborators' time and testing their patience with data entry, and to make progress much faster, as Russell also wants. Jakob recommended we pay for our own (SSD) server. Can we just do that?

Linguista commented 7 years ago

I think we all know this, but I just uploaded a new set of recordings and the speed is as slow as ever, so this is clearly not a transient issue.

xrotwang commented 6 years ago

I wouldn't jump to conclusions here. Talking about performance without proper profiling is typically just guessing. E.g. looking at the soundcomparisons code yesterday, I saw that the existence of sound files is checked on the fly (i.e. per request) by looking into the file system. With >250,000 files and network-attached storage on the server, I could imagine this being way slower than on any machine with a local disk (SSD or not). But rather than with dedicated hardware, this could also be solved by storing metadata about the sound files in the database.
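The metadata idea could look roughly like the sketch below. It is not the soundcomparisons code: the table name `sound_files`, the helper names and the use of SQLite are all assumptions made for illustration, and the real application would use its own database and schema. The point is only that the file system is walked once, at indexing time, and each request then does a single keyed lookup.

```python
import os
import sqlite3
import tempfile


def build_index(conn, sound_root):
    """Walk sound_root once and record every file's relative path.

    The table name `sound_files` is hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sound_files (rel_path TEXT PRIMARY KEY)"
    )
    rows = []
    for dirpath, _dirnames, filenames in os.walk(sound_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            rows.append((os.path.relpath(full_path, sound_root),))
    with conn:  # one transaction: replace the old index with the new one
        conn.execute("DELETE FROM sound_files")
        conn.executemany("INSERT INTO sound_files (rel_path) VALUES (?)", rows)
    return len(rows)


def sound_file_exists(conn, rel_path):
    """Per-request check: a primary-key lookup, with no file system access."""
    cur = conn.execute(
        "SELECT 1 FROM sound_files WHERE rel_path = ?", (rel_path,)
    )
    return cur.fetchone() is not None


# Small demonstration on a throwaway directory (file names are made up).
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "variety_a"))
    present = os.path.join("variety_a", "word_001.mp3")
    missing = os.path.join("variety_a", "word_002.mp3")
    with open(os.path.join(demo_root, present), "wb") as handle:
        handle.write(b"\x00")

    demo_conn = sqlite3.connect(":memory:")
    indexed_count = build_index(demo_conn, demo_root)
    found_present = sound_file_exists(demo_conn, present)
    found_missing = sound_file_exists(demo_conn, missing)
    demo_conn.close()

print(indexed_count, found_present, found_missing)
```

The trade-off is that the index has to be refreshed whenever files are uploaded or removed, otherwise the lookup reports stale results. Whether this actually helps would still need to be confirmed by timing the per-request file checks on the server's storage.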

Also, the particular problem of uploads via SSH and VPN directly to the server adds quite a few possible bottlenecks, so again, this is nothing to simply blame on the GWDG without further investigation. Uploading to cdstar.shh.mpg.de, OTOH, isn't limited to MPG IP addresses and thus doesn't have to go via VPN. So even if one wanted to have the sound files on the server, it may be quicker to upload them to cdstar and then download them to the server from there (within the local GWDG network).