Closed luan-xiaokun closed 7 months ago
Hi,
Thanks for your interest in our work. "In-file negative" means the negative premise is from the same file as the positive premise. Therefore, if the positive premise is from the current file (i.e., ex["pos_premise"].path == ex["context"].path
), all negative premises in the current file are in-file negatives.
Hi, thanks for open releasing the source code of ReProver and providing a very helpful guide on training the model. I notice that the in-file negatives are critical for the performance of ReProver, as explained in the paper, and I'm trying to understand how this is implemented.
As far as I understand, given an example
ex
, this for-loop iterates over all premises that are present in the file where the context belongs, i.e., all premises inex["context"].path
.https://github.com/lean-dojo/ReProver/blob/0dbb82e3507cb8303dbb550bdb96bfbbb37e1ced/retrieval/datamodule.py#L105-L112
Here is what I found a bit confusing: when
ex["pos_premise"].path != ex["context"].path
(i.e., when the premise is imported from some other file instead of defined/proved in the same file), all of these premises we are iterating on will be added topremises_outside_file
, instead ofpremises_in_file
. This seems a bit counter-intuitive, because these premises actually come from the same file as the context, could you please explain a bit more on this?