Open ShouyangDong opened 1 year ago
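Are you sure the answers are completely unrelated? The model may have learned from the training data without fully memorizing it.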
It has learned part of the data, but for some of the harder questions only half of the answer comes out:
Human: What's the requirements of UnsortedSegmentSum?
Assistant: The UnsortedSegmentSum function requires the following inputs:
Human: NMS
Assistant: The NMS function requires the following inputs:
Human: What is NMS?
Assistant: The NMS function is a neural network that combines features from two images into a single image.
Human: Explain the note of UnsortedSegmentSum?
Assistant: The note is not currently supported.
Human: Tell me about the note of UnsortedSegmentSum?
Assistant: The note is not currently supported.
Human: What's the return of WeightNorm?
Assistant: The return of the WeightNorm function is as follows:
Human: Introduce the parameters of cnnlWeightNorm?
Assistant: The parameters for the WeightNorm function are as follows:
Human: Could you tell me the source code of WeightNorm?
Assistant: The source code for the WeightNorm function is as follows:
Human: Introduce the requirements of WeightNormBackward?
Assistant: The requirements for the WeightNormBackward function are as follows:
Human: Tell me the requirements of WeightNormBackward?
Assistant: The requirements of the WeightNormBackward function are as follows:
Human: What's the reference of GetTensorDescriptor?
Assistant: The GetTensorDescriptor function is as follows:
Human: Can you introduce the reference of cnnlGetTensorDescriptor?
Assistant: The reference of the GetTensorDescriptor function is as follows:
Human: What is GetTensorDescriptor?
Assistant: The GetTensorDescriptor function is used to get the tensor descriptor for a given tensor in a GELU operation.
Human: Could you explain the reference of cnnlGetTensorDescriptor?
Assistant: The reference of the GetTensorDescriptor function is as follows:
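As you can see, many answers stop halfway: they end right after "as follows". Is the 7B model too small?
Or maybe the trained result has little to do with my own dataset, and the answers all come from the original base model. In this situation, should I use p-tuning or fine-tuning?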
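To put a number on how often the answers stop at "as follows", a small check like the one below might help. It is only a sketch: the rule that an answer ending in a colon counts as cut off is my own assumption, and `is_truncated` and `truncation_rate` are hypothetical helper names, not part of any library.

```python
def is_truncated(answer: str) -> bool:
    # Assumed heuristic: an answer that ends with ':' announced a list or a
    # code block but stopped before producing it.
    return answer.rstrip().endswith(":")


def truncation_rate(answers) -> float:
    # Fraction of answers flagged by the heuristic above.
    answers = list(answers)
    if not answers:
        return 0.0
    return sum(is_truncated(a) for a in answers) / len(answers)


# Four answers copied from the transcript above.
samples = [
    "The UnsortedSegmentSum function requires the following inputs:",
    "The note is not currently supported.",
    "The return of the WeightNorm function is as follows:",
    "The NMS function is a neural network that combines features "
    "from two images into a single image.",
]

rate = truncation_rate(samples)
print(f"truncated: {rate:.0%}")
```

Running the same check over the reference answers in the training set would show whether the cut-off is already present in the data or only appears at generation time.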
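P-tuning is also a form of fine-tuning. You could first try a small amount of data and see whether the model can memorize it. It also depends on how you construct your data: instruction-style data is easier to learn.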
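A rough sketch of that sanity check, assuming the data is a list of question/answer pairs. The `instruction`/`input`/`output` schema, the helper names, and the `generate` callable are all hypothetical placeholders; swap in whatever your training script and inference code actually use.

```python
import random


def to_instruction_record(question: str, answer: str) -> dict:
    # Hypothetical instruction-style schema; adjust to your trainer's format.
    return {"instruction": question, "input": "", "output": answer}


def sample_subset(records, k: int, seed: int = 0) -> list:
    # Draw a small, reproducible subset to overfit on.
    records = list(records)
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def exact_match_rate(subset, generate) -> float:
    # `generate` is any callable mapping an instruction string to the
    # model's answer string (a stand-in for your inference code).
    if not subset:
        return 0.0
    hits = sum(
        generate(r["instruction"]).strip() == r["output"].strip()
        for r in subset
    )
    return hits / len(subset)


# Placeholder pairs; the answers are dummies, not real documentation.
pairs = [(f"question {i}", f"placeholder answer {i}") for i in range(5)]
records = [to_instruction_record(q, a) for q, a in pairs]
subset = sample_subset(records, k=3)

# Stand-in "model" that has memorized everything, to show the metric.
lookup = {r["instruction"]: r["output"] for r in records}
perfect_rate = exact_match_rate(subset, lambda q: lookup[q])
print(perfect_rate)
```

If the model cannot reach a high exact-match rate even on a few hundred examples it was trained on, the problem is more likely in the data format or the training setup than in the model size.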
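I fine-tuned vicuna7b on my own 240k dataset, training for 13 hours on 4 A100s. The final answers have nothing to do with the training set (this may be related to the pretrained model, but the answers at test time also differ from the ones the pretrained model gives). Do you have any experience with training time and dataset size for this kind of setup? Thanks.
The loss decreases normally, from 2.5 to 0.5.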