Closed Lydiasc34 closed 3 years ago
Hi, you can get the raw text here: https://trec.nist.gov/data/reuters/reuters.html
Hello, thank you for providing the code and link. But which of the corpora at that link is the 'docs.txt'?
Did you finally get this document?
No
From: Perfect0530
Sent: Thursday, November 10, 2022 11:31 AM
To: morningmoni/HiLAP
Cc: wandli; Comment
Subject: Re: [morningmoni/HiLAP] Missing docs.txt file for reading rcv1 data (#10)
Hi @morningmoni, it's still not clear where we can find the docs.txt file used for the RCV1 dataset.
Can you share some more info on it?
Thank you.
Hi,
First of all, thank you for providing the code. I am looking at it for learning purposes. I found that readData_rcv1.py needs the RCV1 text data. However, I could not find the 'docs.txt' at the hyperlink that you provided in the description. I wonder where to get the raw text data. Thanks.