Open rmlignowski345 opened 4 months ago
Update: After installing wget and modifying wgetxpt.py, the problem now seems to be in the module itself. Once the XPT files are downloaded, pynhanes.userdata.load_data doesn't actually work: it doesn't recognize that there are files in the folder and doesn't read them. The function that loads missing XPTs doesn't seem to recognize them either.
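One quick check that doesn't depend on pynhanes is to list what Python itself sees in the folder. For context, pd.concat raises "ValueError: No objects to concatenate" when it is given an empty list, which would fit a loader that matched no files. The sketch below is only a diagnostic: list_xpt_files is a hypothetical helper, and matching the extension case-insensitively is an assumption about what might differ between the downloaded names and what the loader looks for.

```python
from pathlib import Path


def list_xpt_files(folder="XPT"):
    """Return the XPT files in `folder`, matching the extension case-insensitively."""
    root = Path(folder)
    if not root.is_dir():
        # A missing folder usually means the working directory is not what was expected.
        return []
    return sorted(p for p in root.iterdir() if p.is_file() and p.suffix.lower() == ".xpt")


if __name__ == "__main__":
    # Relative paths are resolved against the working directory, so print it first.
    print("working directory:", Path.cwd())
    for path in list_xpt_files("XPT"):
        print(path.name, path.stat().st_size, "bytes")
```

If this prints nothing, the folder is empty or the notebook is running from a different directory than the one the files were downloaded to.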
To download .xpt files, run (for example, for the DEMO category):
python wgetxpt.py DEMO -o XPT or: pywgetxpt DEMO -o XPT
I noticed that the Jupyter scripts assumed that the public mortality data had also been downloaded manually to the XPT folder. I have now changed the functionality so that the scripts should work even without those mortality .dat files. Please upgrade to pynhanes-0.0.20, see details in "Quick start", and let me know if there are still problems.
pywgetxpt still was not working; I had to change line 24 to
cmd = wget.download(f"https://wwwn.cdc.gov/Nchs/Nhanes/{years}/{args.xpt}{suffix}.XPT", out=path)
or it wouldn't download anything. (It didn't give an error message, which was odd, as earlier it would say the original line 25 wasn't a valid command.)
After that, I was able to download it, but in the parse_userdata it wasn't reading the files, and it returned
ValueError: No objects to concatenate
and indicated that the problem was in this line:
data = pynhanes.userdata.load_data(variables, codebook, "XPT")
I commented out that line and then wrote this:
directory_in_str = './XPT/'
directory = os.fsencode(directory_in_str)
#print(directory)
data = []
for file in os.listdir(directory):
    filename = os.fsdecode(file)
    #print(filename)
    with open('./XPT/'+str(filename), 'rb') as f:
        data.append(pd.read_sas(f, format='xport'))
data = pd.concat(data)
This seemed to work, but I'm not certain if the output is what it should be. The dataframe that comes out in the end looks a bit weird. This was just with ['SMQ', 'HSQ', 'DEMO'], if that matters.
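A possible reason the result looks odd: pd.concat stacks tables by rows, so files from different components (SMQ, HSQ, DEMO) end up one below the other, with NaN in the columns they don't share. NHANES tables carry a respondent id column, SEQN, so joining on it may be closer to what is wanted. This is a minimal sketch with toy data, not what pynhanes does internally: the values are made up and merge_on_seqn is a hypothetical helper.

```python
from functools import reduce

import pandas as pd


def merge_on_seqn(tables):
    """Outer-join a list of DataFrames on the shared SEQN column."""
    if not tables:
        raise ValueError("no tables to merge")
    return reduce(
        lambda left, right: pd.merge(left, right, on="SEQN", how="outer"),
        tables,
    )


# Toy stand-ins for two components; the numbers are invented for illustration.
demo = pd.DataFrame({"SEQN": [1.0, 2.0, 3.0], "RIDAGEYR": [34.0, 61.0, 8.0]})
smq = pd.DataFrame({"SEQN": [1.0, 2.0], "SMQ020": [1.0, 2.0]})

stacked = pd.concat([demo, smq])     # rows piled up, one block per table
merged = merge_on_seqn([demo, smq])  # one row per respondent

print(stacked.shape)  # (5, 3)
print(merged.shape)   # (3, 3)
```

Files for the same component from different survey cycles would still need to be stacked by rows before the join.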
Update: load_and_plot did not work, as there seemed to be a lot of 'Unkn's; not sure whether that's an issue with NHANES or the code.
I was having some trouble getting the wgetxpt.py file to run using Jupyter Lab. I tried using
%run wgetxpt.py DEMO -out XPT
but that just created an empty folder named "ut." I also tried
%run wgetxpt.py 'DEMO' -oXPT
but that didn't do anything to the XPT folder I already had. I also couldn't figure out how to run the file in Thonny (I'm not really familiar with the argparse module). Is there anything I could try that might allow me to get the XPT files?
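The "ut" folder is consistent with how argparse reads short options: since the output option is -o, an argument written as -out is taken as -o with the attached value "ut", and the following XPT is left over. The sketch below uses a hypothetical parser with one positional and -o only; the real wgetxpt.py may define its arguments differently.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the script's parser: one positional plus -o."""
    parser = argparse.ArgumentParser()
    parser.add_argument("xpt")
    parser.add_argument("-o", default="XPT", dest="out")
    return parser


parser = build_parser()

# "-out" is split into the short option "-o" and the attached value "ut".
ns, extras = parser.parse_known_args(["DEMO", "-out", "XPT"])
print(ns.out, extras)  # ut ['XPT']

# With a space after -o, the folder name is read as intended.
ns_ok, extras_ok = parser.parse_known_args(["DEMO", "-o", "XPT"])
print(ns_ok.out, extras_ok)  # XPT []
```

Under that reading, the second attempt (-oXPT) would have pointed at the right folder, so the empty result there probably came from the download step rather than from the arguments.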
Update: After a little more experimenting in Thonny, the problem seems to be in this line:
cmd = f"wget https://wwwn.cdc.gov/Nchs/Nhanes/{years}/{args.xpt}{suffix}.XPT -P {path}"
It gives out: 'wget' is not recognized as an internal or external command, operable program or batch file.
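That message means the shell could not find a wget executable, which is common on Windows. One way around it, sketched here under assumptions, is to download with Python's standard library instead of calling an external program. The URL prefix is taken from the line quoted above; xpt_url and download_xpt are hypothetical helpers, the years and suffix values are only illustrative, and the server layout may have changed since.

```python
import shutil
import urllib.request
from pathlib import Path

# Prefix copied from the command in the script; not verified against the live server.
BASE_URL = "https://wwwn.cdc.gov/Nchs/Nhanes"


def xpt_url(years, xpt, suffix=""):
    """Build the file URL the same way the script's f-string does."""
    return f"{BASE_URL}/{years}/{xpt}{suffix}.XPT"


def download_xpt(years, xpt, suffix="", out_dir="XPT"):
    """Fetch one XPT file into `out_dir` without relying on a wget executable."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    target = out_path / f"{xpt}{suffix}.XPT"
    urllib.request.urlretrieve(xpt_url(years, xpt, suffix), target)
    return target


if __name__ == "__main__":
    # shutil.which returns None when the program is not on PATH.
    print("wget on PATH:", shutil.which("wget") is not None)
    print(xpt_url("2017-2018", "DEMO", "_J"))
```

Calling download_xpt("2017-2018", "DEMO", "_J") would then try to save DEMO_J.XPT into the XPT folder.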