Closed LacNguyen-Vidoori closed 2 months ago
Thanks so much! To answer this request, could you provide (either directly or by pointing to other documents) more information about the System of Systems Integration project?
Good afternoon from Suitland, MD! Apologies for the incompleteness of the questionnaire answers; below is the supplemental info:
The Census Person record-linking algorithms that we are planning to test and integrate using the datasets from pseudopeople are actually just one part of a much bigger Census initiative for 2030: the Decennial Transformation & App Modernization (DTAM) program.
High-level objectives of this program are described here:
Please let us know if this would suffice; we'd be more than happy to provide more details on the work otherwise.
Thank you,
Lac Nguyen 301-461-8914
Works for me @aflaxman
Super, I'll send out a download link. Please email me at abie@uw.edu to let me know where to send it.
Link sent and data successfully transferred!
Good morning Abie - got this error upon attempting generate_decennial_census:
(we did specify the source parquet file's dir path according to the downloaded pseudopeople_simulated_population_usa_2_0_0 zip dir structure)
[inline image of the error message not preserved]
Appreciate your guidance and support!
Regards,
Lac
Sorry - forgot to add that this DataSourceError still persisted when we downgraded to older versions of pseudopeople and utilized the generate function (every single one of them, all the way back to 0.1.0)
I'm sorry to hear this is not working for you! Let's see if we can get it sorted out. Can you share the exact code you used that generated this error?
Yessir - here goes:
import pseudopeople as psp
source_directory = "C:\...\pseudopeople\pseudopeople_simulated_population_usa_2_0_0\pseudopeople_simulated_population_usa_2_0_0"
df = psp.generate_decennial_census(source=source_directory, config=psp.NO_NOISE, year=2020, engine='pandas')
Thank you,
Lac
Maybe there is an issue with your source_directory; the three dots in C:\\...\\pseudopeople look suspicious to me. I recommend you try changing directory with a function from pure Python and then using the psp.generate function once you have confirmed that the directory change has succeeded:
import os
source_directory ="C:\\...\\pseudopeople\\pseudopeople_simulated_population_usa_2_0_0\\pseudopeople_simulated_population_usa_2_0_0"
os.chdir(source_directory)
df = psp.generate_decennial_census(source='.', config=psp.NO_NOISE, year=2020, engine='pandas')
If there is something wrong with the source_directory string, this should raise FileNotFoundError when you try to os.chdir.
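The directory check suggested above can be wrapped in a small helper that restores the original working directory afterwards. This is only a sketch: the function name can_change_into is hypothetical and not part of pseudopeople, and it tests nothing beyond whether os.chdir accepts the path.

```python
import os


def can_change_into(path: str) -> bool:
    """Return True if os.chdir(path) succeeds, restoring the original cwd afterwards.

    Hypothetical helper for illustration; not part of pseudopeople.
    """
    original = os.getcwd()
    try:
        os.chdir(path)
    except (FileNotFoundError, NotADirectoryError):
        # The path does not exist, or it points at a file rather than a directory.
        return False
    finally:
        # Always go back, so the check has no lasting side effect.
        os.chdir(original)
    return True
```

On Windows, writing the path as a raw string (for example r"C:\path\to\data", a made-up path) keeps Python from treating the backslashes as escape sequences.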
Good morning Abie - again apologies for the tardiness of this response. Meant to let you know that we've figured out what the issue was: the CHANGELOG.rst file was not included in the source file directory.
(since we were using just one source parquet file out of the 334 as a test, we neglected to copy out the CHANGELOG.rst file to go with it)
Thank you again for your guidance and support!
Lac N.
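For anyone else testing with a subset of the files, a pre-flight check along these lines might catch the same problem before calling the generate function. It is a minimal sketch under stated assumptions: the helper name find_source_problems is hypothetical, CHANGELOG.rst is assumed to sit at the top level of the source directory, and this does not reproduce whatever validation pseudopeople itself performs.

```python
from pathlib import Path


def find_source_problems(source_dir):
    """List likely problems with a pseudopeople source directory.

    Hypothetical helper; the checks are assumptions based on this thread,
    not pseudopeople's own validation logic.
    """
    root = Path(source_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    # Assumption: CHANGELOG.rst is expected directly inside the source directory.
    if not (root / "CHANGELOG.rst").is_file():
        problems.append("CHANGELOG.rst not found at the top level")
    # Look for at least one parquet file anywhere below the source directory.
    if not any(root.rglob("*.parquet")):
        problems.append("no .parquet files found")
    return problems
```

An empty list means none of these particular problems were detected; it does not guarantee that the data will load.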
@aflaxman is this done/should I close it?
I believe so! @LacNguyen-Vidoori : please don't hesitate to re-open if you have additional points to discuss. :)
What is the name of your project?
Testing algorithms for Decennial Census record linking
What is the purpose of your project?
To assist the Census Bureau with the preparation/planning for System of Systems (SoS) Integration for the 2030 Decennial Census, our company, Vidoori Inc., is testing new algorithms that would help with linking Census records from multiple databases, in order to validate Household and Person records during Decennial operations.
Who is involved in the project? Which of these people will have direct access to the pseudopeople input data?
Myself, Lac Nguyen, and my direct supervisor, Mr. Thomas George, head of Vidoori's Data Management department.
What funding is the project under? What expectations with respect to open access and access to data come with that funding?
This is an internal Vidoori project; Mr. George and I will be the only two people who access and run tests with this dataset on a regular basis. No other Vidoori personnel will access or use this dataset in any way, shape, or form, even though the project is funded by Vidoori.
We commit to:
What data would you like to request?
Other data - more explanation
No response