Closed ctmckee closed 4 years ago
Also, I ran through your full notebook using your defined data mixture, and everything worked great.
That error makes it sound like it is looking for a "\t" but not finding it in one of the lines (possibly the first). If you run something like
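import functools

import tensorflow as tf

tf.enable_eager_execution()  # TF 1.x; eager execution is the default in TF 2.x

# Read the TSV one line at a time, then split each line on tabs into two string fields.
ds = tf.data.TextLineDataset(path_to_your_tsv_file)
ds = ds.map(
    functools.partial(tf.io.decode_csv, record_defaults=["", ""],
                      field_delim="\t", use_quote_delim=False),
    num_parallel_calls=tf.data.experimental.AUTOTUNE)

# Print only the first parsed example.
for ex in ds:
    print(ex)
    break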
does it produce the same error? If so, you may need to check the format of the TSV and the arguments of tf.io.decode_csv.
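A format check along those lines can also be sketched in plain Python, without TensorFlow. This is only an illustration: `find_bad_tsv_lines` is a hypothetical helper, not part of the notebook or of any library, and it assumes each record should contain exactly two tab-separated fields (question and answer), matching `record_defaults=["", ""]` above.

```python
def find_bad_tsv_lines(lines, expected_fields=2, delimiter="\t"):
    """Return (line_number, field_count, text) for lines with the wrong field count.

    Hypothetical helper: with quoting disabled, splitting on every tab gives the
    same field count that the decode_csv call above expects.
    """
    bad = []
    for line_number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        field_count = len(text.split(delimiter))
        if field_count != expected_fields:
            bad.append((line_number, field_count, text))
    return bad


# Small in-memory example: the second record has no tab at all.
sample = ["first question?\tfirst answer\n", "second question with no answer column\n"]
for line_number, field_count, text in find_bad_tsv_lines(sample):
    print("line %d has %d field(s): %r" % (line_number, field_count, text))
```

To scan a real file, the same function could be called on an open file handle, for example `find_bad_tsv_lines(open(path_to_your_tsv_file, encoding="utf-8"))`.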
Hi, and thanks for the quick response. The code you provided does not produce the same error.
It produced:
(<tf.Tensor: id=1249, shape=(), dtype=string, numpy=b'What type of genome, (RNA or DNA, double stranded single stranded) is found in the the virus that causes blue tongue disease?'>, <tf.Tensor: id=1250, shape=(), dtype=string, numpy=b'double stranded, segmented RNA'>)
Hm, can you replace the
    print(ex)
    break
with
    pass
and see if it iterates through the full dataset without hitting the error you originally posted?
AHA! Thank you. My validation set is OK, but the train set produced the error when it hit this line: Which histone modifications are correlated with transcription elongation? ['H3K36me3'].
I will run a replace to take out "['" and "']".
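A minimal sketch of that replacement, assuming the wrapper appears only at the very start and end of the answer text. `strip_list_wrapper` is a hypothetical name used for illustration, and the thread does not confirm that the wrapper alone causes the field-count error.

```python
def strip_list_wrapper(answer):
    """Remove a leading "['" and trailing "']" from an answer string, if both are present."""
    answer = answer.strip()
    if answer.startswith("['") and answer.endswith("']"):
        return answer[2:-2]
    return answer


# Answers taken from the examples quoted in this thread.
print(strip_list_wrapper("['H3K36me3']"))
print(strip_list_wrapper("double stranded, segmented RNA"))
```

Answers without the wrapper are returned unchanged, so the function could be applied to every answer before the TSV is written.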
👍
I am following your notebook on context-free QA (thanks for setting that up). I am substituting your "natural_question" dataset with BioASQ-training7b. I am only using BioASQ "exact_answers" entries with a length of one in the TSV files I create for training and validation. Visually, these TSV files seem correct (attached png). Also, all of the cells in your notebook work with my TSV files (and I create a data mixture of BioASQ and TriviaQA). However, when I run the model.finetune cell, I get the following error immediately after the "Enqueue next (100) batch(es)" and "Dequeue next (100) batch(es)" messages:
INFO:tensorflow:Enqueue next (100) batch(es) of data to infeed.
INFO:tensorflow:Dequeue next (100) batch(es) of data from outfeed.
ERROR:tensorflow:Error recorded from infeed: From /job:worker/replica:0/task:0:
{{function_node __inference_Datasetmap<class 'functools.partial'>_1622}} Expect 2 fields but have 1 in record 0
    [[{{node DecodeCSV}}]]
    [[while/IteratorGetNext]]
Any suggestions?