bigscience-workshop/data_tooling
Tools for managing datasets for governance and training.
Apache License 2.0
Create dataset lihkg #345
Status: Open
albertvillanova opened this issue 2 years ago
albertvillanova commented 2 years ago:
uid: lihkg
type: primary
description:
  name: LIHKG
  description: Most Popular Hong Kong Forum
  homepage: https://lihkg.com/category/1
  validated: True
languages:
  language_names:
    - Chinese
    - Yue Chinese, Cantonese
  language_comments: Cantonese
  language_locations:
    - Eastern Asia
    - Hong Kong
  validated: False
custodian:
  name: 許業珩
  in_catalogue:
  type: A private individual
  location: Hong Kong
  contact_name: 許業珩
  contact_email:
  contact_submitter: False
  additional: https://lihkg.com/thread/1660096/page/1
  validated: False
availability:
  procurement:
    for_download: No - but the current owners/custodians have contact information for data queries
    download_url:
    download_email: https://t.me/lihkg_official
  licensing:
    has_licenses: Unclear
    license_text:
    license_properties:
    license_list:
  pii:
    has_pii: Yes - text author name only
    generic_pii_likely:
    generic_pii_list:
    numeric_pii_likely:
    numeric_pii_list:
    sensitive_pii_likely:
    sensitive_pii_list:
    no_pii_justification_class: general knowledge not written by or referring to private persons
    no_pii_justification_text:
  validated: False
source_category:
  category_type: website
  category_web: forum
  category_media:
  validated: False
media:
  category:
    - text
    - image
  text_format:
    - .HTML
  audiovisual_format:
  image_format:
    - .JPEG
  database_format:
  text_is_transcribed: No
  instance_type: post
  instance_count: 100K<n<1M
  instance_size: 10<n<100
  validated: False
fname: lihkg.json
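
A minimal sketch of how an entry like this could be checked for sections that still await validation. It assumes the fields above are stored in lihkg.json as nested JSON objects, each top-level section carrying its own `validated` flag. The function name `unvalidated_sections`, the exact nesting, and the inline sample (a subset of the fields listed in this issue) are illustrative assumptions, not taken from the data_tooling code.

```python
import json


def unvalidated_sections(entry):
    """Return names of top-level sections whose 'validated' flag is not True.

    Scalar fields such as 'uid' or 'type' are skipped because they are not
    sections (assumption: sections are JSON objects).
    """
    return [
        name
        for name, section in entry.items()
        if isinstance(section, dict) and section.get("validated") is not True
    ]


# Hypothetical subset of the record in this issue; the nesting is assumed.
sample = json.loads("""
{
  "uid": "lihkg",
  "type": "primary",
  "description": {"name": "LIHKG", "validated": true},
  "languages": {"language_names": ["Chinese", "Yue Chinese, Cantonese"], "validated": false},
  "custodian": {"type": "A private individual", "validated": false},
  "media": {"category": ["text", "image"], "validated": false}
}
""")

pending = unvalidated_sections(sample)
print(pending)  # ['languages', 'custodian', 'media']
```

For the full record, the same call would be made on the object returned by `json.load` for lihkg.json, provided the file follows the nesting assumed here.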