models/research/astronet TypeError: buffer is too small for requested array

surfound commented 6 years ago

When I execute this command:

bazel-bin/astronet/data/generate_input_records --input_tce_csv_file=${TCE_CSV_FILE} --kepler_data_dir=${KEPLER_DATA_DIR} --output_dir=${TFRECORD_DIR} --num_worker_processes=5

It works normally for a while:

INFO:tensorflow:PoolWorker-3: Wrote 1573 items in shard train-00002-of-00008
INFO:tensorflow:PoolWorker-2: Processed 1430/1573 items in shard train-00005-of-00008
INFO:tensorflow:PoolWorker-5: Processed 380/1574 items in shard test-00000-of-00001
INFO:tensorflow:PoolWorker-1: Processed 1540/1573 items in shard train-00000-of-00008
INFO:tensorflow:PoolWorker-4: Processed 1570/1574 items in shard train-00003-of-00008
INFO:tensorflow:PoolWorker-2: Processed 1440/1573 items in shard train-00005-of-00008
INFO:tensorflow:PoolWorker-5: Processed 390/1574 items in shard test-00000-of-00001
INFO:tensorflow:PoolWorker-1: Processed 1550/1573 items in shard train-00000-of-00008
INFO:tensorflow:PoolWorker-4: Wrote 1574 items in shard train-00003-of-00008
INFO:tensorflow:PoolWorker-2: Processed 1450/1573 items in shard train-00005-of-00008
INFO:tensorflow:PoolWorker-1: Processed 1560/1573 items in shard train-00000-of-00008
INFO:tensorflow:PoolWorker-5: Processed 400/1574 items in shard test-00000-of-00001
INFO:tensorflow:PoolWorker-2: Processed 1460/1573 items in shard train-00005-of-00008
INFO:tensorflow:PoolWorker-5: Processed 410/1574 items in shard test-00000-of-00001
INFO:tensorflow:PoolWorker-1: Processed 1570/1573 items in shard train-00000-of-00008
INFO:tensorflow:PoolWorker-2: Processed 1470/1573 items in shard train-00005-of-00008
INFO:tensorflow:PoolWorker-1: Wrote 1573 items in shard train-00000-of-00008

Then an error occurred:

Traceback (most recent call last):
  File "/home/stuf/astronet/bazel-bin/astronet/data/generate_input_records.runfiles/main/astronet/data/generate_input_records.py", line 301, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "/home/stuf/astronet/bazel-bin/astronet/data/generate_input_records.runfiles/main/astronet/data/generate_input_records.py", line 293, in main
    async_result.get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
TypeError: buffer is too small for requested array

All the previous steps ran normally; only this one fails.

tensorflowbutler commented 6 years ago

Thank you for your post. We noticed you have not filled out the following field in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.

What is the top-level directory of the model you are using
Have I written custom code
OS Platform and Distribution
TensorFlow installed from
TensorFlow version
Bazel version
CUDA/cuDNN version
GPU model and memory
Exact command to reproduce

surfound commented 6 years ago

What is the top-level directory of the model you are using: models/research/astronet
Have I written custom code: N/A
OS Platform and Distribution: Ubuntu 16.04 LTS
TensorFlow installed from: https://github.com/tensorflow/tensorflow
TensorFlow version: 1.5.0
Bazel version: Build label: 0.11.1
CUDA/cuDNN version: N/A
GPU model and memory: CPU only
Exact command to reproduce: see above

It ran for some hours before failing with: TypeError: buffer is too small for requested array

fsolecki commented 6 years ago

Hi,

I have the same issue:

Issue Template

What is the top-level directory of the model you are using: models/research/astronet
Have I written custom code: no
OS Platform and Distribution: Linux Mint 18 Cinnamon 64-bit, kernel 4.4.0-122
TensorFlow installed from: sources from https://github.com/tensorflow/tensorflow
TensorFlow version: 1.8.0-rc1
Bazel version: 0.13.0
CUDA/cuDNN version: N/A
GPU model and memory: CPU: Intel Core i5-3337U, no GPU, Memory: 8GiB
Exact command to reproduce:

1. Download Kepler Data

Use the script from README.md to download 90GB of data:

TCE_CSV_FILE="${HOME}/astronet/dr24_tce.csv"

KEPLER_DATA_DIR="${HOME}/astronet/kepler/"

python astronet/data/generate_download_script.py \
  --kepler_csv_file=${TCE_CSV_FILE} \
  --download_dir=${KEPLER_DATA_DIR}

./get_kepler.sh

with attached file (exported as xlsx instead of csv as github does not accept csv files): dr24_tce.xlsx

2. Process Kepler Data

Use the script from README.md:

bazel build astronet/...

TFRECORD_DIR="${HOME}/astronet/tfrecord"

bazel-bin/astronet/data/generate_input_records \
  --input_tce_csv_file=${TCE_CSV_FILE} \
  --kepler_data_dir=${KEPLER_DATA_DIR} \
  --output_dir=${TFRECORD_DIR} \
  --num_worker_processes=5

After several hours, the script stops with:

INFO:tensorflow:PoolWorker-2: Processed 50/1574 items in shard val-00000-of-00001
INFO:tensorflow:PoolWorker-3: Processed 90/1573 items in shard train-00005-of-00008
INFO:tensorflow:PoolWorker-1: Processed 50/1574 items in shard test-00000-of-00001
WARNING: File may have been truncated: actual file length (176345) is smaller than the expected size (457920) [astropy.io.fits.file]
Traceback (most recent call last):
  File "${HOME}/repositories/models/research/astronet/bazel-bin/astronet/data/generate_input_records.runfiles/__main__/astronet/data/generate_input_records.py", line 302, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "${HOME}/repositories/models/research/astronet/bazel-bin/astronet/data/generate_input_records.runfiles/__main__/astronet/data/generate_input_records.py", line 294, in main
    async_result.get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
TypeError: buffer is too small for requested array
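
The traceback ends in multiprocessing/pool.py because async_result.get() re-raises the worker's exception in the parent process, so it does not say which input file triggered the error. A minimal sketch of one way to surface that, assuming a hypothetical worker function process_file (this is not the actual generate_input_records code) and using a thread pool only to keep the example self-contained:

```python
import functools
import traceback
from multiprocessing.pool import ThreadPool


def with_context(fn):
  """Wraps fn so that any exception names the argument that triggered it."""

  @functools.wraps(fn)
  def wrapper(arg):
    try:
      return fn(arg)
    except Exception as e:  # pylint: disable=broad-except
      # Keep the worker-side traceback, which the pool would otherwise drop.
      raise RuntimeError("%s failed on %r: %s\n%s" %
                         (fn.__name__, arg, e, traceback.format_exc()))

  return wrapper


def process_file(file_name):
  """Hypothetical worker; stands in for reading one .fits file."""
  if file_name.endswith("bad.fits"):
    raise TypeError("buffer is too small for requested array")
  return file_name


def run(file_names):
  """Processes all files in a pool and returns the results in order."""
  pool = ThreadPool(2)
  try:
    worker = with_context(process_file)
    async_results = [pool.apply_async(worker, (f,)) for f in file_names]
    return [r.get() for r in async_results]
  finally:
    pool.close()
    pool.join()
```

With a wrapper like this, the error raised by get() would include the failing file name, which may be enough to spot a bad download without scanning the whole data directory.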

Thanks for your time

tsapin-logic commented 6 years ago

any progress on this one?

surfound commented 6 years ago

Unfortunately, I have made no progress, because I have other work to do. I intend to ask my teacher for help next week.

If I find out how to solve this, I will let you know by email.


fsolecki commented 6 years ago

The issue was in the Kepler data: some of the .fits files were not downloaded correctly.

To check the downloaded files you can:

create the file astronet/data/check_data.py:

# Copyright 2018 The TensorFlow Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Script to check data from the Kepler space telescope."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import os
import sys

import tensorflow as tf

from light_curve_util import kepler_io

parser = argparse.ArgumentParser()

parser.add_argument(
    "--kepler_data_dir",
    type=str,
    required=True,
    help="Base folder containing Kepler data.")

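def _check_fits(file_name):
  """Try to read the .fits file and log an error if the file is not valid.

  Args:
    file_name: Path to the input .fits file.
  """
  try:
    kepler_io.read_kepler_light_curve([file_name])
  except Exception as e:  # pylint: disable=broad-except
    # A truncated or corrupt download surfaces here, for example as
    # "TypeError: buffer is too small for requested array".
    tf.logging.error("Error in file %s: %s", file_name, e)


def main(argv):
  del argv  # Unused.

  num_file = 0

  for path, _, files in os.walk(FLAGS.kepler_data_dir):
    for name in files:
      # Try to read every downloaded file under the data directory.
      _check_fits(os.path.join(path, name))
      num_file += 1
  tf.logging.info("Finished checking %d total files", num_file)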

if __name__ == "__main__":
  tf.logging.set_verbosity(tf.logging.INFO)
  FLAGS, unparsed = parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)

update the BUILD file in astronet/data with:

py_binary(
    name = "check_data",
    srcs = ["check_data.py"],
    deps = ["//light_curve_util:kepler_io"],
)

rebuild with
```
bazel build astronet/...
```

use the script check_data.sh:

bazel-bin/astronet/data/check_data --kepler_data_dir=${KEPLER_DATA_DIR}
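
As a quicker first pass that does not need a bazel build, file sizes alone can flag many truncated downloads: the FITS standard lays files out in 2880-byte blocks, so a size that is not a multiple of 2880 (such as the 176345 bytes in the astropy warning above) suggests an incomplete file. A minimal sketch, assuming all light curves end in .fits; the function name is made up for illustration, and a file truncated exactly on a block boundary would not be caught:

```python
import os

# FITS files are organized in blocks of 2880 bytes.
FITS_BLOCK_SIZE = 2880


def find_suspect_fits_files(base_dir):
  """Returns sorted (path, size) pairs for .fits files with an odd size.

  A file is suspect if it is empty or if its size is not a multiple of
  the FITS block size. This is a heuristic, not a full validity check.
  """
  suspects = []
  for path, _, files in os.walk(base_dir):
    for name in files:
      if not name.endswith(".fits"):
        continue
      full_path = os.path.join(path, name)
      size = os.path.getsize(full_path)
      if size == 0 or size % FITS_BLOCK_SIZE != 0:
        suspects.append((full_path, size))
  return sorted(suspects)


if __name__ == "__main__":
  import sys
  for file_path, file_size in find_suspect_fits_files(sys.argv[1]):
    print("%s (%d bytes)" % (file_path, file_size))
```

Any file it reports can be deleted and downloaded again; the check_data script above is still the more thorough test, since it actually reads each light curve.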

surfound commented 6 years ago

OK, I will try. But I already downloaded the data with the script in the folder.


fsolecki commented 6 years ago

You must first download the data using the script from the general instructions (get_kepler.sh), then check what you have downloaded with the script above.

tensorflowbutler commented 4 years ago

Hi there, we are checking to see if you still need help with this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing. If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help with this issue any more, please consider closing it.

tensorflow / models

models/research/astronet TypeError: buffer is too small for requested array #3800