allenai / s2-folks

Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.

Q: Construct dictionary mapping paragraph to sectionheader #194

Open biaoyanf opened 2 months ago

biaoyanf commented 2 months ago

Hi,

Is it possible to map the paragraphs to their corresponding sections in the s2orc dataset?

I downloaded the s2orc dataset and can find the full text, paragraph, and sectionheader annotations. However, I cannot find any information that maps each paragraph to its corresponding section (or sectionheader).

Thanks.

cfiorelli commented 1 month ago

@biaoyanf Hello, thank you for this interesting question. I vaguely recall running into the same issue the last time I worked with s2orc data. I may have tried some regex logic to index paragraphs, but it has been some time. Have you found a solution yet?
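
One possible approach, as a rough sketch rather than a confirmed solution: if the paragraph and sectionheader annotations are character-offset spans into the full text, each paragraph can be assigned to the nearest header that starts at or before it. The code below assumes each annotation is a list (or a JSON-encoded string of a list) of objects with `start` and `end` offsets. That layout is an assumption about the data, and the helper names `load_spans` and `map_paragraphs_to_headers` are made up for illustration.

```python
import bisect
import json


def load_spans(raw):
    """Parse an annotation field into a list of spans sorted by start offset.

    Assumption: the field is either a JSON-encoded string or an already
    decoded list of {"start": ..., "end": ...} objects; None means no spans.
    """
    if raw is None:
        return []
    spans = json.loads(raw) if isinstance(raw, str) else raw
    # int() tolerates offsets stored either as numbers or as numeric strings.
    parsed = [{"start": int(s["start"]), "end": int(s["end"])} for s in spans]
    return sorted(parsed, key=lambda s: s["start"])


def map_paragraphs_to_headers(text, paragraphs, headers):
    """Pair each paragraph with the closest header that starts before it.

    Returns a list of (header_text_or_None, paragraph_text) tuples in
    document order. Paragraphs that precede every header get None.
    """
    paragraphs = load_spans(paragraphs)
    headers = load_spans(headers)
    header_starts = [h["start"] for h in headers]

    mapping = []
    for p in paragraphs:
        # Index of the last header whose start offset is <= the paragraph start.
        idx = bisect.bisect_right(header_starts, p["start"]) - 1
        if idx >= 0:
            h = headers[idx]
            header_text = text[h["start"]:h["end"]]
        else:
            header_text = None
        mapping.append((header_text, text[p["start"]:p["end"]]))
    return mapping
```

This relies only on span positions, so it avoids regex matching on header text. It does not distinguish nested subsections from top-level sections; every paragraph is simply attached to the most recent header of any level.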