TimZaman / dotaclient

distributed RL spaghetti al arabiata

gcp upload of tb file too big #13

Closed · TimZaman closed this 5 years ago

TimZaman commented 5 years ago
2019-01-13 16:42:25,801 INFO     steps_per_s=176.02, avg_weight_age=6.12, mean_reward=-0.60, loss=0.7546
Traceback (most recent call last):
  File "optimizer.py", line 514, in <module>
    mq_prefetch_count=args.mq_prefetch_count,
  File "optimizer.py", line 493, in main
    dota_optimizer.run()
  File "optimizer.py", line 315, in run
    self.step(experiences=experiences)
  File "optimizer.py", line 429, in step
    blob.upload_from_filename(filename=self.events_filename)
  File "/root/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1136, in upload_from_filename
    predefined_acl=predefined_acl,
  File "/root/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1081, in upload_from_file
    client, file_obj, content_type, size, num_retries, predefined_acl
  File "/root/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 991, in _do_upload
    client, stream, content_type, size, num_retries, predefined_acl
  File "/root/.local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 938, in _do_resumable_upload
    response = upload.transmit_next_chunk(transport)
  File "/root/.local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py", line 392, in transmit_next_chunk
    method, url, payload, headers = self._prepare_request()
  File "/root/.local/lib/python3.7/site-packages/google/resumable_media/_upload.py", line 530, in _prepare_request
    self._stream, self._chunk_size, self._total_bytes)
  File "/root/.local/lib/python3.7/site-packages/google/resumable_media/_upload.py", line 825, in get_next_chunk
    raise ValueError(msg)
ValueError: 8682306 bytes have been read from the stream, which exceeds the expected total 8664934.
TimZaman commented 5 years ago

Definitely not related to google.cloud.storage itself. Relates to TensorboardX. Maybe flushing?
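
That would be consistent with a write-during-upload race: upload_from_filename sizes the file once and then streams it, so any bytes appended in between overshoot the expected total. A minimal sketch of that race with no GCS involved at all (the filename is illustrative):

import os

with open('events.out', 'wb') as w:
    w.write(b'x' * 1000)
    w.flush()
    expected = os.path.getsize('events.out')  # size captured at "upload" start
    w.write(b'y' * 100)                       # writer appends mid-upload
    w.flush()

with open('events.out', 'rb') as r:
    actual = len(r.read())

# actual (1100) > expected (1000): the stream yields more bytes than the
# expected total, which is the ValueError google-resumable-media raises.
print(expected, actual)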

TimZaman commented 5 years ago

Can be reproed:

from google.cloud import storage
from tensorboardX import SummaryWriter

bucket = storage.Client().bucket('my-bucket')  # bucket name is illustrative
writer = SummaryWriter()

for i in range(32):
    writer.add_text('test', 'x' * 1000000, i)  # ~1 MB appended per step
    blob = bucket.blob('foo2.events')
    blob.upload_from_filename(filename=writer.file_writer.event_writer._ev_writer._file_name)
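
Each add_text call appends roughly a megabyte to the events file, and TensorboardX writes through a background thread, so the file keeps growing while upload_from_filename is streaming it. The upload captures the file size once up front (the "expected total 8664934" in the traceback above), and any bytes that land on disk after that point push the stream past it.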
TimZaman commented 5 years ago

Seemed fixed with a flush:

from google.cloud import storage
from tensorboardX import SummaryWriter

bucket = storage.Client().bucket('my-bucket')  # bucket name is illustrative
writer = SummaryWriter()

for i in range(32):
    writer.add_text('test', 'x' * 1000000, i)
    writer.file_writer.flush()  # drain pending events to disk before uploading
    blob = bucket.blob('foo2.events')
    blob.upload_from_filename(filename=writer.file_writer.event_writer._ev_writer._file_name)
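
If anything can still write between the flush and the upload (e.g. the optimizer's writer thread), a more defensive variant, continuing from the snippet above, is to upload a frozen snapshot copy rather than the live events file; the snapshot path here is illustrative:

import os
import shutil

events_filename = writer.file_writer.event_writer._ev_writer._file_name

writer.file_writer.flush()
snapshot = events_filename + '.snapshot'    # illustrative temp path
shutil.copyfile(events_filename, snapshot)  # freeze the bytes to upload
bucket.blob('foo2.events').upload_from_filename(filename=snapshot)
os.remove(snapshot)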