FunctionSir / PanDefenseProject

Working to eliminate violent "internet addiction treatment schools" and similar institutions.
https://functionsir.github.io/PanDefenseProject/
GNU Affero General Public License v3.0

Next step suggestions #11

Open vxst opened 2 weeks ago

vxst commented 2 weeks ago
  1. Establishment date and previous name. Dehumanizing detention facilities usually change their names after certain incidents.

  2. Scale, specifically how many people (approximately) are in the facility. This is useful because: a) It can be used in various reports summarizing the overall situation of dehumanizing detention facilities; b) It instantly categorizes different types of facilities, as facilities with similar numbers of people tend to behave similarly.

  3. A database for each facility. I can build a website to add individual evidence for each facility if needed. It's a good idea to keep sensitive information like contact persons (e.g., recruitment accounts) confidential to prevent misuse of the project.

vxst commented 2 weeks ago

I have the approximate scale of most of the facilities listed here, which can be used as a starting point.

I'm a little confused about how consistency is maintained. Shall we write a script to generate the JSON and CSV files from the conf? Why update all the files with repeated contents?

FunctionSir commented 2 weeks ago
> 1. Establishment date and previous name. Dehumanizing detention facilities usually change their names after certain incidents.
>
> 2. Scale, specifically how many people (approximately) are in the facility. This is useful because: a) It can be used in various reports summarizing the overall situation of dehumanizing detention facilities; b) It instantly categorizes different types of facilities, as facilities with similar numbers of people tend to behave similarly.
>
> 3. A database for each facility. I can build a website to add individual evidence for each facility if needed. It's a good idea to keep sensitive information like contact persons (e.g., recruitment accounts) confidential to prevent misuse of the project.

Thanks for your advice! But there are some problems. First, it's hard to track name changes, and even harder to get proof of a name change. Second, it's hard to know how many victims there are in the facilities. Hard as it is, it's still a very good idea, and I think we should TRY. A database is good; I think SQLite is really suitable for this project. A wiki would be even better. I have a small computer that runs 24/7, and I think I can store some evidence on it too. I think we can protect witnesses by using GPG mail encryption. If we use GPG with the I2P mail service (postman), we get better security, and the I2P mail service will give witnesses more privacy.
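As a rough illustration of how the three suggestions (establishment date, previous names, approximate scale) could fit into SQLite, here is a minimal sketch using Python's standard `sqlite3` module. The table names, column names, and helper functions are all hypothetical and not taken from the project. Witness and contact data are deliberately left out, since the thread agrees those must not sit in a public database.

```python
import sqlite3

# Hypothetical schema: one row per facility, plus a table of earlier names
# so that a renamed facility can still be found under its old name.
SCHEMA = """
CREATE TABLE IF NOT EXISTS facility (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL UNIQUE,
    established   TEXT,     -- ISO date string, NULL when unknown
    approx_people INTEGER   -- rough head count, NULL when unknown
);
CREATE TABLE IF NOT EXISTS previous_name (
    facility_id INTEGER NOT NULL REFERENCES facility(id),
    name        TEXT NOT NULL,
    source      TEXT        -- where the rename was documented, if anywhere
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_facility(conn, name, established=None, approx_people=None,
                 previous_names=()):
    """Insert one facility and any known earlier names; return its id."""
    cur = conn.execute(
        "INSERT INTO facility (name, established, approx_people) "
        "VALUES (?, ?, ?)",
        (name, established, approx_people),
    )
    facility_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO previous_name (facility_id, name) VALUES (?, ?)",
        [(facility_id, old) for old in previous_names],
    )
    conn.commit()
    return facility_id


def find_by_any_name(conn, name):
    """Return ids of facilities whose current or previous name matches."""
    rows = conn.execute(
        "SELECT id FROM facility WHERE name = ? "
        "UNION "
        "SELECT facility_id FROM previous_name WHERE name = ?",
        (name, name),
    ).fetchall()
    return [row[0] for row in rows]
```

Keeping previous names in their own table means a facility can be renamed any number of times without changing the main record, and unknown dates or head counts simply stay NULL.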

vxst commented 2 weeks ago

I have the list of numbers of people in the facilities, but as our list does not completely overlap with this project, I can only add the overlapping ones.

I can provide a server for this purpose. For the contact people and witnesses, the database should not be publicly available. It remains to be seen whether anyone will volunteer to write the website or if I should do it myself.

FunctionSir commented 2 weeks ago

> I have the approximate scale of most of the facilities listed here, which can be used as a starting point.
>
> I'm a little confused about how consistency is maintained. Shall we write a script to generate the JSON and CSV files from the conf? Why update all the files with repeated contents?

Oh, that's good!!!

In fact, we have the "ZiMuProject Tool" to convert the conf to JSON and CSV. If you want to modify the CSV, we have csv2ini.py to convert the CSV back to conf. So it's not very hard to maintain consistency.

We maintain different formats for the convenience of different users and use cases. If you want to build a website on top of the data, you might prefer the JSON file. If you want to view it in Excel or do some data analysis, you might prefer the CSV file. If you want to merge two entries, CSV is easier (you can do that in software like LibreOffice Calc, then run csv2ini.py). If you want to add a new entry or modify something, conf is better; you can then use the tool to generate the JSON and CSV.
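To make the conf → JSON/CSV step concrete, here is a minimal Python sketch. It is not the ZiMuProject Tool and does not reproduce its output. It assumes the conf file is INI-style with one section per facility, which the name csv2ini.py hints at but the thread does not confirm, and the field names in the usage example are made up.

```python
import configparser
import csv
import io
import json


def load_conf(text):
    """Parse INI-style text into a list of dicts, one per section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written
    parser.read_string(text)
    # The section title becomes a "name" field (an assumption of this sketch).
    return [{"name": s, **dict(parser[s])} for s in parser.sections()]


def to_json(entries):
    """Serialize entries as JSON, leaving non-ASCII text readable."""
    return json.dumps(entries, ensure_ascii=False, indent=2)


def to_csv(entries):
    """Serialize entries as CSV; missing fields become empty cells."""
    fields = []
    for entry in entries:
        for key in entry:
            if key not in fields:
                fields.append(key)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buf.getvalue()
```

For example, `to_csv(load_conf("[Example School]\nprovince = X\n"))` yields a header row `name,province` followed by one data row. Treating conf as the single source and regenerating the other two formats from it is what keeps the three files from drifting apart.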

FunctionSir commented 2 weeks ago

> I have the list of numbers of people in the facilities, but as our list does not completely overlap with this project, I can only add the overlapping ones.
>
> I can provide a server for this purpose. For the contact people and witnesses, the database should not be publicly available. It remains to be seen whether anyone will volunteer to write the website or if I should do it myself.

That's good. Thanks!!!

I have a server too, and I think we can store the info on both my server and yours. It's good to have backups or mirrors.

I think you are right, we SHOULD NOT expose ANY OF THE WITNESSES' INFO TO THE PUBLIC. It's DANGEROUS for them if any of that info is exposed. And I think we need STRONG ENCRYPTION too.

vxst commented 2 weeks ago

I believe we can easily convert any structured data to JSON and CSV formats without too much effort. Wouldn't it make sense to integrate this tool into our CI pipeline and create a dedicated Git branch for the formatted data?

Given that the tool isn't open-source, I'm not sure if we need to update the schema file to add the scale/number of people field. Nevertheless, it would be much more efficient to have this process managed by CI rather than running it manually for every pull request. GitHub's CI fully supports this level of automation at no extra cost.
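One check a CI job could run, whichever tool generates the files, is to fail the build when the committed JSON and CSV disagree. Below is a minimal Python sketch, assuming both files hold one record per facility keyed by a `name` field. That key and both function names are hypothetical, not taken from the project.

```python
import csv
import io
import json


def index(entries, key):
    """Map each entry's key value to its non-empty fields, as strings."""
    return {
        e[key]: {k: str(v) for k, v in e.items() if v not in ("", None)}
        for e in entries
    }


def out_of_sync(json_text, csv_text, key="name"):
    """Return the sorted keys of entries that differ between the two files."""
    from_json = index(json.loads(json_text), key)
    from_csv = index(csv.DictReader(io.StringIO(csv_text)), key)
    names = set(from_json) | set(from_csv)
    return sorted(n for n in names if from_json.get(n) != from_csv.get(n))


def main(json_path, csv_path):
    """Return 0 when the files agree, 1 otherwise (usable as an exit code)."""
    with open(json_path, encoding="utf-8") as f:
        json_text = f.read()
    with open(csv_path, encoding="utf-8", newline="") as f:
        csv_text = f.read()
    bad = out_of_sync(json_text, csv_text)
    for name in bad:
        print(f"mismatch: {name}")
    return 1 if bad else 0
```

A CI step would call `main` with the two file paths and pass the result to `sys.exit`, so a mismatch fails the pull request. Empty cells are ignored on both sides because CSV cannot distinguish a missing field from an empty one.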

vxst commented 2 weeks ago

> I think you are right, we SHOULD NOT expose ANY OF THE WITNESSES' INFO TO THE PUBLIC. It's DANGEROUS for them if any of that info is exposed. And I think we need STRONG ENCRYPTION too.

Yep, mine is in a SOC certified data center with LUKS. If you're interested in building the services (Rails + Postgres maybe? Or MediaWiki if it's more convenient for contributors), I can give you access. Which do you prefer?

FunctionSir commented 2 weeks ago

> I believe we can easily convert any structured data to JSON and CSV formats without too much effort. Wouldn't it make sense to integrate this tool into our CI pipeline and create a dedicated Git branch for the formatted data?
>
> Given that the tool isn't open-source, I'm not sure if we need to update the schema file to add the scale/number of people field. Nevertheless, it would be much more efficient to have this process managed by CI rather than running it manually for every pull request. GitHub's CI fully supports this level of automation at no extra cost.

That tool is 100% open source under AGPLv3, in the zmp_tool dir. You can use cargo build to build it, or cargo run to just run it. You can also get the tool from "Releases".

The tool provides a guide, defaults, and a search function to help me maintain the list. Using it makes adding new entries more efficient.

P.S. My English is REALLY POOR, so if I made you angry, I'm sorry, I didn't mean to.

FunctionSir commented 2 weeks ago

> > I think you are right, we SHOULD NOT expose ANY OF THE WITNESSES' INFO TO THE PUBLIC. It's DANGEROUS for them if any of that info is exposed. And I think we need STRONG ENCRYPTION too.
>
> Yep, mine is in a SOC certified data center with LUKS. If you're interested in building the services (Rails + Postgres maybe? Or MediaWiki if it's more convenient for contributors), I can give you access. Which do you prefer?

Is DokuWiki available? I use DokuWiki more often.

vxst commented 2 weeks ago

> P.S. My English is REALLY POOR, so if I made you angry, I'm sorry, I didn't mean to.

I'm sorry for the misunderstanding, but I'm not angry at all. I was just wondering whether you'll set up the pipeline or whether I should make a PR for that.

I'm a little occupied right now and can code the PR tomorrow; if you can write it yourself, that would be great!

vxst commented 2 weeks ago

> Is DokuWiki available? I use DokuWiki more often.

I think I can set up a DokuWiki and write an import script for the combined list. It should be done by this weekend if nothing unexpected pops up in my calendar. :-)

vxst commented 2 weeks ago

> I'm sorry for the misunderstanding, but I'm not angry at all. I was just wondering whether you'll set up the pipeline or whether I should make a PR for that.

@FunctionSir After reviewing the code, which is interactive, I believe it would be better if you could add a non-interactive CLI to it. I'm more of a Go enthusiast and I'm not really fond of Rust (no disrespect).
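To show what a non-interactive interface needs so that CI can call it, here is a small sketch of the argument handling, written in Python only for brevity. It is not the Rust tool's actual interface, and the program name and every flag below are hypothetical.

```python
import argparse


def build_parser():
    """Describe a prompt-free command line: one input, optional outputs."""
    parser = argparse.ArgumentParser(
        prog="convert",  # hypothetical name, not the real tool
        description="Convert the conf file without interactive prompts.",
    )
    parser.add_argument("--input", required=True,
                        help="path to the conf file")
    parser.add_argument("--json-out", help="where to write the JSON file")
    parser.add_argument("--csv-out", help="where to write the CSV file")
    return parser


def parse_args(argv):
    """Parse argv and reject a call that requests no output at all."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.json_out is None and args.csv_out is None:
        # parser.error prints the usage text and exits with status 2.
        parser.error("nothing to do: pass --json-out and/or --csv-out")
    return args
```

The point for CI is that everything the tool needs arrives as arguments and every failure surfaces as a non-zero exit status, so a workflow step never waits on a prompt.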

xioi commented 2 weeks ago

> > I'm sorry for the misunderstanding, but I'm not angry at all. I was just wondering whether you'll set up the pipeline or whether I should make a PR for that.
>
> @FunctionSir After reviewing the code, which is interactive, I believe it would be better if you could add a non-interactive CLI to it. I'm more of a Go enthusiast and I'm not really fond of Rust (no disrespect).

@vxst @FunctionSir I've seen that you pinned some Python repos on your profile pages, so what about using Python? A large number of people use it, and even high school students learn it because of curriculum standards. If we rewrite tdp_tool in Python, we can attract contributors who are put off by Rust.

xioi commented 2 weeks ago

And what matters most is finding a common language we can collaborate in, which promotes code reuse and attracts contributors.

xioi commented 2 weeks ago

Ah, I'd forgotten that FuncSir can code in Go... It's a good choice as well, and I've also written some Go code before. It's fine.

vxst commented 2 weeks ago

> Ah, I'd forgotten that FuncSir can code in Go... It's a good choice as well, and I've also written some Go code before. It's fine.

Of course, I can rewrite the script in Python, but it's generally inappropriate in the open source world to ask others to change languages or override their work on an already completed project solely based on language preference. I think Rust is suitable for the conversion job, and so far, FunctionSir's code has functioned perfectly. It just needs some adjustments to invoke it with CI, and I prefer not to fight with the Rust compiler (call me stubborn), so the project author is the best person to handle this job.

xioi commented 2 weeks ago

> > Ah, I'd forgotten that FuncSir can code in Go... It's a good choice as well, and I've also written some Go code before. It's fine.
>
> Of course, I can rewrite the script in Python, but it's generally inappropriate in the open source world to ask others to change languages or override their work on an already completed project solely based on language preference. I think Rust is suitable for the conversion job, and so far, FunctionSir's code has functioned perfectly. It just needs some adjustments to invoke it with CI, and I prefer not to fight with the Rust compiler (call me stubborn), so the project author is the best person to handle this job.

You're right. It depends on FunctionSir.

FunctionSir commented 2 weeks ago

I think Python 3 is good. Python is easy to write, you can code very fast, and there are a lot of libs. The problem is that you need to do some "pip install" steps before you can run it, and Python is so slow... Go is good too, especially since it is a compiled language: you get binaries and can run them easily. The problem is that Go might be harder than Python, and you need some magic to access proxy.golang.org in mainland China... Then again, in some ways Go is even simpler than Python, since it has a simpler grammar. However, KISS is the most important thing.

FunctionSir commented 2 weeks ago

And yes, Rust is so hard... and kind of "wordy"...